Abstract. Consider a random graph, having a pre-specified degree distribution
$F$ but other than that being uniformly distributed, describing the social
structure (friendship) in a large community. Suppose one individual in the
community is externally infected by an infectious disease and that the disease
has its course by assuming that infected individuals infect their not yet
infected friends independently with probability $p$. For this situation the
paper determines $R_0$ and $\tau_0$, the basic reproduction number and the
asymptotic final size in case of a major outbreak. Further, the paper looks at
some different local vaccination strategies where individuals are chosen
randomly and vaccinated, or friends of the selected individuals are vaccinated,
prior to the introduction of the disease. For the studied vaccination
strategies the paper determines $R_v$, the reproduction number, and $\tau_v$,
the asymptotic final proportion infected in case of a major outbreak, after
vaccinating a fraction $v$.



1. Introduction 

Simple undirected random graphs can be used to describe the social net- 
work in a large community (e.g. [IH]), vertices corresponding to individuals 
and edges to some type of social relation, from now on denoted friendship. 
Given such a graph, a model for the spread of the disease may be defined, 
where individuals at first are susceptible but may then become infected by a 
friend. An infected individual has the potential to spread the disease to its 
not yet infected friends before it recovers and becomes immune. The final 
outbreak, both its size and who gets infected, depends on properties of the 
social graph as well as on properties of disease transmission. In order to 
prevent an outbreak it is possible to vaccinate, or immunize in some other 
way, individuals prior to the arrival of the disease. The choice of whom, and
how many, to vaccinate specifies the vaccination strategy.

The present paper studies questions arising from such modeling. In partic- 
ular, we consider random graphs where the degree distribution (the number 
of friends) follows some pre-specified distribution F, typically having heavy 
tails, but where the random graph G is otherwise uniformly distributed. The
epidemic model is the simplest possible model for a susceptible-infectious-removed
(SIR) disease (e.g. [2]). One randomly selected individual is initially
externally infected. Any individual who becomes infected infects each of 
his/her not yet infected friends independently with probability p, and after 
that the individual recovers and becomes immune, a state called removed. 
For this graph and epidemic model we study different vaccination strategies: 
the uniform strategy and the acquaintance strategy [7]. In both strategies
individuals are chosen randomly from the community. In the uniform strat- 
egy the selected individuals are vaccinated and in the acquaintance strategy 
a randomly chosen friend of the selected individual is vaccinated. Both vac- 
cination strategies are local in the sense that the global social network need 
not be known in order to perform the strategy. We also study a vaccination 
strategy where, instead of selecting individuals at random, friendships are 
selected and one or two of the corresponding friends get vaccinated. 

As the population size $n$ tends to infinity, we prove that the initial phase
of the epidemic may be approximated by a suitable branching process. The
largest eigenvalue of the branching process, often denoted $R_0$ and called
the basic reproduction number when applied to epidemics [2], determines
whether a major outbreak can occur or not: if $R_0 < 1$ only minor outbreaks
can occur whereas if $R_0 > 1$ outbreaks of order $O(n)$ can also occur with
positive probability. In case of a major outbreak the total number of
individuals infected during the outbreak, the final size, is shown to satisfy a
law of large numbers. The corresponding (random) proportion is shown to converge
in probability to a deterministic limit $\tau_0$. Similar results are obtained when
a vaccination strategy with vaccination coverage $v$ has been performed prior
to disease introduction. In this situation the strategy-specific reproduction
number $R_v$, and the major outbreak size $\tau_v$, are determined. From this it is
possible to determine the (strategy-specific) critical vaccination coverage $v_c$
which determines the necessary proportion to vaccinate in order to surely
prevent a major outbreak, so $v_c = \inf\{v : R_v < 1\}$.

Stochastic epidemic models on networks with pre-specified degree
distributions have mainly been studied in the physics literature (e.g. [16], [18],
[7]), Andersson [1] being one exception. Some of the problems studied in
the present paper have been analysed before whereas others have not, in
particular the final size proportion $\tau_v$ as a function of $v$. Besides
contributing some new results, another aim of the paper is to give formal proofs
of results which have previously only been obtained heuristically.

The rest of the paper is structured as follows. In Section 2 we define the
models for the random graph, the epidemic and the vaccination strategies.
In Section 3 we present the main results, motivate them with some heuristics
and give some examples and illustrations. The proofs are given in Sections
4 and 5.



2. Models

2.1. Graphs. Let $G$ denote a random multigraph, allowing for multiple
edges and loops, and let $n = |G|$ denote the number of vertices of $G$, i.e. the
population size. Later we shall consider limits as $n \to \infty$. We define our
random multigraph as follows. Let $n \in \mathbb{N}$ and let
$(d_i)_1^n = (d_i^{(n)})_1^n$ be a sequence of non-negative integers such that
$\sum_{i=1}^n d_i$ is even. We define a random multigraph with given degree
sequence $(d_i)_1^n$, denoted by $G^*(n, (d_i)_1^n)$, by the configuration model
(see e.g. [5]): take a set of $d_i$ half-edges for each vertex $i$, and combine
the half-edges into pairs by a uniformly random matching of the set of all
half-edges.
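
The pairing of half-edges described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the paper: the names `configuration_multigraph` and `is_simple` are hypothetical, and a uniformly random matching is drawn by shuffling the list of half-edges and reading it off in consecutive pairs.

```python
import random


def configuration_multigraph(degrees, rng=None):
    """Pair half-edges uniformly at random (configuration model).

    degrees[i] is the prescribed degree d_i of vertex i; the sum must be even.
    Returns a list of edges (u, v); loops and multiple edges may occur.
    """
    rng = rng or random.Random()
    if sum(degrees) % 2 != 0:
        raise ValueError("the sum of the degrees must be even")
    # One list entry per half-edge, labelled by the vertex it belongs to.
    half_edges = [v for v, d in enumerate(degrees) for _ in range(d)]
    # A uniformly random permutation, read off in consecutive pairs,
    # gives a uniformly random perfect matching of the half-edges.
    rng.shuffle(half_edges)
    return [(half_edges[i], half_edges[i + 1])
            for i in range(0, len(half_edges), 2)]


def is_simple(edges):
    """Return True if the multigraph has no loops and no multiple edges."""
    seen = set()
    for u, v in edges:
        key = (min(u, v), max(u, v))
        if u == v or key in seen:
            return False
        seen.add(key)
    return True
```

One way to sample the simple graph obtained by conditioning is to repeat `configuration_multigraph` until `is_simple` returns True; this rejection step is an assumption of the sketch, not a procedure stated in the text.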

Note that $G^*(n, (d_i)_1^n)$ does not have exactly the uniform distribution over
all multigraphs with the given degree sequence; there is a weight with a
factor $1/j!$ for every edge of multiplicity $j$, and a factor $1/2$ for every loop, see
[10, §1]. However, conditioned on the multigraph being a (simple) graph,
we obtain a uniformly distributed random graph with the given degree
sequence, which we denote by $G(n, (d_i)_1^n)$. It is also worth mentioning that
the distribution of $G^*(n, (d_i)_1^n)$ is the same as the one obtained by sampling
the edges as ordered pairs of vertices uniformly with replacement, and then
conditioning on the vertex degrees being correct.

Let us write $2m := \sum_{i=1}^n d_i$, so that $m = m(n)$ is the number of edges in
the multigraph $G^*(n, (d_i)_1^n)$. We assume that we are given $(d_i)_1^n$ satisfying
the following regularity conditions, cf. Molloy and Reed [14, 15].

Condition 2.1. For each $n$, $(d_i)_1^n = (d_i^{(n)})_1^n$ is a sequence of
non-negative integers such that $\sum_{i=1}^n d_i$ is even and, for some probability
distribution $(p_j)_{j=0}^\infty$ independent of $n$, and with $n_j := \#\{i : d_i = j\}$,

(i) $n_j/n \to p_j$ for every $j \ge 0$ as $n \to \infty$;

(ii) $\mu := \sum_j j p_j \in (0, \infty)$;

(iii) $2m/n \to \mu$ as $n \to \infty$;

(iv) $p_2 < 1$.

Remark 2.2. Note that $2m = \sum_i d_i = \sum_j j n_j$. Thus, Condition 2.1 implies
that the sum $\sum_j j n_j/n$ converges uniformly for $n \ge 1$, i.e.

    $\lim_{J \to \infty} \sup_n \sum_{j > J} j n_j/n = 0.$    (2.1)

Conversely, (2.1) together with (i) and (ii) implies (iii). (This follows from,
e.g., [U Theorem 5.5.4], taking $X_n$ to be the degree of a random vertex.)

Note that our condition is slightly weaker than the one in Molloy and Reed
[14, 15]: they also assume (in an equivalent formulation) that if
$\sum_j j^2 p_j < \infty$, then the sums $\sum_j j^2 n_j/n$ converge uniformly; moreover they
assume that $j^2 n_j/n \to j^2 p_j$ uniformly.

Condition 2.1 is all we need to study the random multigraph $G^*(n, (d_i)_1^n)$.
In order to treat the random simple graph $G(n, (d_i)_1^n)$, which is our main
model, we need an additional assumption.

Condition 2.3. $\sum_i d_i^2 = O(n)$.

Note that $\sum_i d_i^2 = \sum_j j^2 n_j$, so Conditions 2.1 and 2.3 imply, by Fatou's
lemma, that $\sum_j j^2 p_j < \infty$; in other words, the asymptotic degree
distribution has finite variance.

When Conditions 2.1 and 2.3 hold, the probability that $G^*(n, (d_i)_1^n)$ is a
simple graph is bounded away from 0, see Subsection 5.2 for details, and
thus all results that can be stated in terms of convergence in probability for
$G^*(n, (d_i)_1^n)$ transfer to the random simple graph $G(n, (d_i)_1^n)$ too.



2.2. Alternative graph models. We will in the remainder of the paper
consider $G(n, (d_i)_1^n)$ as our underlying graph model, but we believe that
similar results hold for other random graph models too, and that they could be
proved by suitable modifications of the branching process arguments below.
Good candidates are the classical random graphs $G(n,p)$ and $G(n,m)$, with
$p = \mu/n$ and $m = n\mu/2$ (rounded to an integer), respectively, and random
graphs of the general type $G(n,\kappa)$ defined in [5]. We will not pursue this
here, and leave such attempts to modify the proofs to the interested reader,
but we will discuss one interesting case (including $G(n,p)$) where the results
easily follow from the results proved below for $G(n, (d_i)_1^n)$.

This example is a random graph defined by Britton, Deijfen and Martin-Löf
[6, Section 3], see also [5, Subsection 16.4], as follows. Let $W$ be a
non-negative random variable with finite expectation $\mu_W := E W$. We first
assign random weights $W_i$, $i = 1, \dots, n$, to the vertices; these weights are
i.i.d. with the same distribution as $W$. Secondly, given $\{W_i\}_1^n$, we draw an
edge between vertices $i$ and $j$ with probability

    $p_{ij} := \frac{W_i W_j}{n + W_i W_j};$    (2.2)

this is done independently (conditioned on $\{W_i\}$) for all pairs $\{i,j\}$ with
$1 \le i < j \le n$. We denote this random graph by $G_W(n)$. It is easily seen [6]
that (2.2) implies that all graphs with a given degree sequence $(d_i)_1^n$ have
the same probability. Hence, if we denote the (random) vertex degrees by
$D_1, \dots, D_n$, then conditioned on $D_i = d_i$, $i = 1, \dots, n$, we have a random
graph $G(n, (d_i)_1^n)$. Moreover, it is not difficult to verify that Condition 2.1
holds in probability, with $(p_j)_0^\infty$ the mixed Poisson distribution
$\mathrm{Po}(\mu_W W)$ and $\mu = \mu_W^2$, see [6, Theorem 3.1] and [5, Theorem 3.13]; in other
words, $n_j/n \overset{p}{\to} p_j$ and $2m/n = n^{-1} \sum_i d_i \overset{p}{\to} \mu$. Assume from now on
that $E W^2 < \infty$; it may then be shown by similar arguments that
$n^{-1} \sum_i d_i^2 \overset{p}{\to} \mu(E W^2 + 1)$. Using the Skorohod coupling theorem, see
e.g. [12, Theorem 4.30], we can assume that these limits hold a.s.; hence
Conditions 2.1 and 2.3 hold a.s. Consequently, by conditioning on
$(D_1, \dots, D_n)$, we can apply the results proved in the present paper for
$G(n, (d_i)_1^n)$, and it follows that the theorems below hold for the random graph
$G_W(n)$ too, with $(p_j)$ and $\mu$ as above.

With suitable couplings, using for example $(1 + n^{-1/2}) W_i$ for upper bounds,
it is easy to see that this remains true if (2.2) is modified to

    $p_{ij} := \min\left(\frac{W_i W_j}{n}, 1\right).$    (2.3)

Random graphs defined by this definition and minor variations of it have
been studied by several authors, see [5, Subsection 16.4] and the references
given there. Note that the special (deterministic) case $W = \sqrt{\mu}$ for a
constant $\mu > 0$ gives the classical random graph $G(n, \mu/n)$. The results in this
paper thus hold for $G(n, \mu/n)$ too, with $(p_j)$ a $\mathrm{Po}(\mu)$ distribution; in other
words, with $D$ defined in Section 3, $D \sim \mathrm{Po}(\mu)$.

2.3. Epidemic model. We consider an infectious disease that spreads along
the edges of a graph $G$. We will in this paper assume that $G = G(n, (d_i)_1^n)$ is
the random graph defined above, where we condition the graph
$G^*(n, (d_i)_1^n)$ on being simple. The vertices of $G$ are the individuals in the
population, and the edges represent friendships through which infection might spread.

The disease has its course in the following way. Initially, one randomly
chosen individual (vertex) is infected from the outside. This individual then
spreads the disease to each of its friends independently and with the same
probability $p$. Those who get infected make up the first generation infected
in the epidemic. These individuals then do the same thing to their not yet
infected friends, thus infecting a second generation, and so forth. Note that
an individual can only get infected once; we then consider such an individual
either recovered and immune or dead. This epidemic continues until there
are no new infections in a generation, when it stops. Since the population is
finite this happens after a finite number of generations ($\le n$, where $n = |G|$
is the size of the population). The individuals who get infected during the
course of the epidemic make up the total outbreak, and the number of such
individuals is called the final size of the epidemic.

Note that each edge is a possible path of infection at most once, namely 
when the first of its endpoints has been infected. Hence we may just as well 
determine in advance for every edge in G whether it will spread the disease 
or not, provided that one of the endpoints gets infected. Equivalently, we 
may consider the graph $G_p$ obtained by randomly deleting edges from $G$,
with each edge kept with probability $p$, independently of the others. The
final size of the epidemic is thus the size of the component of $G_p$ containing
the initially infected individual.
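
The equivalence between the epidemic and the component of the thinned graph can be illustrated by the following Python sketch. It is a minimal illustration under stated assumptions rather than part of the paper: the function name `final_outbreak` and the edge-list input format are hypothetical. Each edge is kept with probability p, and a breadth-first search from the initially infected vertex then collects the outbreak, level by level as in the generations of the epidemic.

```python
import random
from collections import deque


def final_outbreak(n, edges, p, start, rng=None):
    """Return the set of vertices ultimately infected.

    Each edge is kept independently with probability p (the thinned graph);
    the outbreak is the component of the thinned graph containing `start`.
    """
    rng = rng or random.Random()
    neighbours = [[] for _ in range(n)]
    for u, v in edges:
        if rng.random() < p:  # this edge transmits if an endpoint is infected
            neighbours[u].append(v)
            neighbours[v].append(u)
    infected = {start}
    queue = deque([start])
    while queue:  # breadth-first search over the kept edges
        u = queue.popleft()
        for w in neighbours[u]:
            if w not in infected:
                infected.add(w)
                queue.append(w)
    return infected
```

The final size of the epidemic is then `len(final_outbreak(n, edges, p, start))`.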

2.4. Vaccination strategies. Assume now that a perfect vaccine is avail- 
able. By this we mean that an individual who is vaccinated is completely 
protected from (i.e., immune to) the disease and is not able to spread the 
disease further. We assume that a part of the population is vaccinated be- 
fore the epidemic starts, or as soon as the first individual is infected. The 
epidemic progresses as defined above, with the only difference that infected 
individuals can only infect unvaccinated friends.

Note that for the study of the epidemic in the vaccinated population, we
may simply remove all vaccinated individuals from $G$ (and edges connected
to these individuals). If we let $G_v$ denote the remaining graph, and we
assume that the initially infected individual $x$ is not vaccinated, the final
size of the epidemic is thus the size of the component of $G_{v;p} := (G_v)_p$
that contains $x$. We thus have to study the combined effect on $G$ of vertex
deletion by the vaccination and edge deletion by the randomness of infection.

The goal is to contain the disease, so that the final size of the epidemic
is small, and it is preferable to do this with a rather small number of
vaccinations. For this we look at different local vaccination strategies. The first
two strategies are local in the sense that they require no global knowledge
of the social network $G$ (which is rarely available in applications, [17,
Section 8.2]) and the latter two select friendships rather than individuals at
random, which may also be thought of as needing only local information. We
let $V$ denote the (usually random) number of vaccinations.

Uniform vaccination. Let us assume that we sample a fraction $c \in [0, 1]$
of the population, chosen uniformly without replacement, and that this
fraction is immunized, so the fraction $v$ being immunized satisfies $v = c$. This
vaccination strategy is the most commonly studied vaccination strategy due
to its simplicity [17, Section 8.2].

More precisely, for convenience, we assume that each individual is
vaccinated with a given probability $v$, independently of each other. The number
$V$ of vaccinations is thus $\mathrm{Bi}(n, v)$, and $V/n \overset{p}{\to} v$ as $n \to \infty$ (with $v$ fixed).
We denote the remaining graph of unvaccinated individuals by $G^U_v$; this is
thus obtained from $G$ by random vertex deletions. Remember that our main
concern is with the graph $G^U_{v;p} = (G^U_v)_p$; this is obtained from $G$ by random
vertex and edge deletions, independently for all vertices and edges. (In this
case, it does not matter whether we delete edges or vertices first.)

Acquaintance vaccination. It is intuitively clear that a better vaccination
strategy would be to vaccinate the individuals with highest degrees (most
friends) since this would reduce potential spread the most. However, for
this targeted vaccination strategy to be achievable the whole social graph
(or at least the degrees of all individuals) would have to be known, and this
is rarely the case [17, Section 8.2]. A different strategy aiming at
vaccinating individuals with high degree, but still only using local graph-knowledge
from selected individuals, proposed by Cohen et al. [7], goes under the name
acquaintance vaccination. In this vaccination strategy a fraction $c$ of
individuals are sampled, and for each sampled individual one of its friends,
chosen randomly among all friends, is vaccinated. Of course it may happen
that some individuals are chosen more than once for immunization (being
selected as friends of more than one individual) so the fraction $v = v(c)$
actually immunized is smaller than $c$. This vaccination strategy has two
slightly different variants depending on whether the "fraction" $c$ is chosen
with or without replacement. We will use the version with replacement. For
this case the "fraction" $c$ may in fact exceed 1 without having everyone
vaccinated (individuals who are selected more than once are asked for friends
independently each time and friends not yet immunized are vaccinated).
To be precise, we let the number of individuals sampled be Poisson
distributed $\mathrm{Po}(cn)$, with $c \in [0, \infty)$. Equivalently, each individual is sampled
$\mathrm{Po}(c)$ times, and each time reports a randomly chosen friend. Again, for
simplicity, we assume that each individual does this with replacement.
Consequently, an individual with degree $d$ will report each of its friends $\mathrm{Po}(c/d)$
times, and these random numbers are all independent. (An individual that
is sampled but has no friends is ignored. An individual is only vaccinated
once, even if he or she is reported several times.)

For any initial graph $G$ and $0 \le c < \infty$, we denote the remaining graph
of unvaccinated individuals by $G^A_c$. We further write $G^A_{c;p} = (G^A_c)_p$ for the
graph obtained by additional edge deletions. (For acquaintance vaccination,
the order of the deletions is important, since the vaccination strategy uses
all edges, without knowing whether they may be selected to transmit the
disease or not.)
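
The acquaintance strategy with replacement can be simulated directly from this description. The sketch below is an illustration under stated assumptions and not the authors' code: the names `poisson` and `acquaintance_vaccination` and the adjacency-list input are hypothetical, and Poisson variates are drawn with Knuth's multiplication method, which is adequate only for moderate values of c.

```python
import math
import random


def poisson(lam, rng):
    """Draw a Poisson(lam) variate by Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def acquaintance_vaccination(neighbours, c, rng=None):
    """Return the set of vaccinated vertices.

    neighbours[i] lists the friends of vertex i. Each vertex is sampled
    Po(c) times and names a uniformly chosen friend each time.
    """
    rng = rng or random.Random()
    vaccinated = set()
    for friends in neighbours:
        if not friends:
            continue  # a sampled individual without friends is ignored
        for _ in range(poisson(c, rng)):
            # Naming an already vaccinated friend has no further effect.
            vaccinated.add(rng.choice(friends))
    return vaccinated
```

Removing the returned vertices (and their edges) from the graph gives the graph of unvaccinated individuals; edge thinning with probability p must be applied afterwards, in line with the remark on the order of deletions.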

Edgewise vaccination. In some situations it may be possible to observe, or at
least sample, the edges representing friendships. If this is the case, another
reasonable vaccination strategy is to sample a number of the edges and then
vaccinate either one (randomly selected) endpoint or both endpoints; we
denote these two versions by E1 and E2, respectively.

For E2, we assume that we sample each edge with probability $1 - \alpha$,
where $\alpha \in (0, 1]$ is a fixed number. (Equivalently, we sample $\mathrm{Po}(cm)$ edges
with replacement, with $\alpha = e^{-c}$.) For E1, we assume for simplicity that
we sample $\mathrm{Po}(2cm)$ edges with replacement; thus each end of each edge is
sampled with probability $1 - \alpha = 1 - e^{-c}$, independently of all other edge
ends. Hence, for both versions, a vertex with degree $d$ is unvaccinated with
probability $\alpha^d$, and for E1, this is independent of all other vertices.

For an initial graph $G$ and $0 < \alpha \le 1$, we denote the remaining graph of
unvaccinated individuals by $G^{E1}_\alpha$ and $G^{E2}_\alpha$, for the two versions. We further
write, for $j = 1, 2$, $G^{Ej}_{\alpha;p} = (G^{Ej}_\alpha)_p$ for the graph obtained by additional edge
deletions.

3. Main results 

We now state our main results together with heuristic motivations. We
assume that the underlying graph is the random graph $G(n, (d_i)_1^n)$ and that
Conditions 2.1 and 2.3 hold. Complete proofs are given in Sections 4 and 5.

3.1. Original epidemic model. Assume that $n$, the number of nodes, is
large. The regularity assumption on the degrees of the graph (Condition 2.1)
implies that no separate node will contain a large fraction of all edges, see
(2.1). This in turn implies that self loops, multiple edges and short cycles
will be rare.

The epidemic starts by a randomly selected individual being infected
from outside, so this individual has (approximately) the degree
distribution $(p_j)_{j=0}^\infty$. The friends of this individual, or of any individual, have the
size biased degree distribution $(\tilde p_j)_{j=0}^\infty$, where

    $\tilde p_j = j p_j / \sum_k k p_k.$    (3.1)

Let $D$ and $\tilde D$ be random variables having these degree distributions
respectively. Then, given that $D = d$, the number of individuals that the initially
infected infects is $\mathrm{Bi}(d, p)$, and the unconditional distribution is hence mixed
binomial $\mathrm{MixBi}(D, p)$. Those then infected, as well as infecteds in the
following generations, have degree distribution $(\tilde p_j)_{j=0}^\infty$. Given that $\tilde D = d$, the
number of individuals an infected individual infects in the next generation
has distribution $\mathrm{Bi}(d - 1, p)$. This follows because the infected was infected
by one of his friends (which cannot get reinfected) and, since short cycles are
rare, it is very unlikely that any of the remaining $d - 1$ friends have already
been infected. Unconditionally, the number infected in the next generation
is hence $\mathrm{MixBi}(\tilde D - 1, p)$. Further, the property that short cycles are
unlikely implies that the numbers of infections caused by different individuals
are (approximately) independent random variables.

The above paragraph motivates why the early stages of the epidemic may
be approximated by a branching process (e.g. [3]), as is common for epidemic
models (e.g. [2]), and where "giving birth" corresponds to infecting someone.
The branching process is a simple Galton-Watson process starting with one
ancestor having off-spring distribution $X \sim \mathrm{MixBi}(D, p)$ and the following
generations have off-spring distribution $\tilde X \sim \mathrm{MixBi}(\tilde D - 1, p)$. The mean
of this latter off-spring distribution plays an important role in branching
process theory and also in epidemic theory, where it is denoted $R_0$ and
called the basic reproduction number. We get the following, using (3.1):

    $R_0 = E(\tilde X) = p\,E(\tilde D - 1) = p\,\frac{\sum_j j(j-1) p_j}{\mu} = p\left(\mu + \frac{\mathrm{Var}(D) - \mu}{\mu}\right),$    (3.2)

where $\mu = E(D) = \sum_k k p_k$ and $\mathrm{Var}(D) = \sum_j j^2 p_j - \mu^2$ (a closely related
expression is obtained in [1]). The branching process is subcritical, critical
or supercritical depending on whether $R_0 < 1$, $R_0 = 1$ or $R_0 > 1$. For the
epidemic, this means that a major outbreak, infecting a non-negligible fraction of
the community, is possible if and only if $R_0 > 1$. Note that, for fixed $\mu$, $R_0$
is increasing in $\mathrm{Var}(D)$, so the more variance in the degree distribution, the
higher $R_0$, and if the degree distribution has infinite variance then $R_0 = \infty$
(a case not treated in the present manuscript due to Condition 2.3).
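
As a numerical illustration, the mean of the off-spring distribution in the later generations, that is p times the mean of the size biased degree minus one, can be evaluated directly from a degree distribution. The sketch below assumes the distribution is given as a finite Python dictionary mapping each degree j to its probability; the function name `basic_reproduction_number` is hypothetical.

```python
def basic_reproduction_number(pmf, p):
    """Return R_0 = p * E[D(D - 1)] / E[D] for a degree pmf {j: p_j}.

    E[D(D - 1)] / E[D] is the mean of the size biased degree minus one,
    so this is the mean number infected by an individual in a later generation.
    """
    mu = sum(j * pj for j, pj in pmf.items())
    second_factorial_moment = sum(j * (j - 1) * pj for j, pj in pmf.items())
    return p * second_factorial_moment / mu
```

For example, with degrees 1 and 3 equally likely the mean degree is 2 and the variance is 1, and the function returns 1.5 p, in agreement with p(mu + (Var(D) - mu)/mu) = p(2 - 1/2).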

The probability $\pi$ that the branching process dies out is derived in the
standard way as follows. First, we derive the probability $\tilde\pi$ that a branching
process with all individuals having off-spring distribution $\tilde X$ dies out. This
is obtained by conditioning on the number of individuals born in the first
generation: for the branching process to die out, all branching processes
initiated by the individuals of the first generation must die out, i.e.

    $\tilde\pi = \sum_{k=0}^\infty \tilde\pi^k P(\tilde X = k).$

Let $f_{\tilde X}(\cdot)$ denote the probability generating function for $\tilde X$, and $f_D(\cdot)$ the
probability generating function of the original degree distribution $D$. Then
we see that $\tilde\pi$ is a solution to the equation $f_{\tilde X}(t) = t$, and it is known from
branching process theory (e.g. [3, Theorem 1.5.1]) that it is the smallest
non-negative such solution. The fact that $\tilde X$ is $\mathrm{MixBi}(\tilde D - 1, p)$ implies that

    $f_{\tilde X}(t) = E(t^{\tilde X}) = E(E(t^{\tilde X} \mid \tilde D)) = E((pt + 1 - p)^{\tilde D - 1}) = E((1 - p(1 - t))^{\tilde D - 1}).$

Further,

    $E(a^{\tilde D - 1}) = \sum_k a^{k-1} \frac{k p_k}{\mu} = \frac{1}{\mu} \frac{d}{da} \sum_k a^k p_k = \frac{1}{\mu} \frac{d}{da} f_D(a) = \frac{f_D'(a)}{\mu} = \frac{f_D'(a)}{f_D'(1)}.$

In terms of $f_D(\cdot)$ the probability $\tilde\pi$ that the branching process dies out is
hence the smallest non-negative solution to

    $\tilde\pi = \frac{f_D'(1 - p(1 - \tilde\pi))}{f_D'(1)}.$    (3.3)

The probability $\pi$ that the branching process, in which the ancestor has
different off-spring distribution $X$, dies out, is obtained from $\tilde\pi$ by
conditioning on the number of off-spring of the ancestor:

    $\pi = \sum_k \tilde\pi^k P(X = k) = E(\tilde\pi^X) = E(E(\tilde\pi^X \mid D)) = E((p\tilde\pi + 1 - p)^D) = f_D(1 - p(1 - \tilde\pi)).$    (3.4)
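
The two extinction probabilities can be computed numerically from (3.3) and (3.4). The following Python sketch is one possible approach under stated assumptions, not a method taken from the paper: the degree distribution is a finite dictionary mapping degree to probability, the function name `extinction_probabilities` is hypothetical, and (3.3) is solved by fixed-point iteration started at 0. Since the right-hand side of (3.3) is an increasing function, the iterates increase towards the smallest non-negative solution.

```python
def extinction_probabilities(pmf, p, tol=1e-12, max_iter=100_000):
    """Return (pi_tilde, pi) for a degree pmf {j: p_j} and transmission prob. p.

    pi_tilde solves pi_tilde = f_D'(1 - p(1 - pi_tilde)) / f_D'(1), cf. (3.3);
    pi = f_D(1 - p(1 - pi_tilde)), cf. (3.4).
    """
    mu = sum(j * pj for j, pj in pmf.items())  # f_D'(1) = E(D)

    def f(s):
        return sum(pj * s ** j for j, pj in pmf.items())

    def f_prime(s):
        return sum(j * pj * s ** (j - 1) for j, pj in pmf.items() if j > 0)

    pi_tilde = 0.0  # starting at 0 leads to the smallest non-negative root
    for _ in range(max_iter):
        new = f_prime(1.0 - p * (1.0 - pi_tilde)) / mu
        converged = abs(new - pi_tilde) < tol
        pi_tilde = new
        if converged:
            break
    pi = f(1.0 - p * (1.0 - pi_tilde))
    return pi_tilde, pi
```

For a graph in which every vertex has degree 3 and p = 0.75, equation (3.3) reads x = (0.25 + 0.75x)^2, with smallest root 1/9, and (3.4) then gives (1/3)^3 = 1/27. Convergence of the iteration is slow when the process is close to critical.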

We now look at the final size of the epidemic in case it takes off,
corresponding to the case that the branching process grows beyond all limits. We
do this by considering the epidemic from a graph representation. The social
structure was represented by a random graph $G$. If this graph is thinned by
removing each edge independently with probability $1 - p$ we get a thinned
graph denoted $G_p$. Edges in $G_p$ represent potential spread of infection: if
one of the nodes gets infected from elsewhere, its neighbour will get infected.
As a consequence, the final outbreak of the epidemic will consist of all nodes
in $G_p$ that are connected to the initially infected. From random graph
theory it is known that if $R_0 > 1$ there will be exactly one connected component
of order $n$, the giant component, and all remaining connected components
will be of smaller order. If $R_0 < 1$ there will be no giant component. The
initially infected was chosen uniformly in the community so it will belong
to the giant component with a probability that equals the relative size of
the giant component. On the other hand, the initially infected belongs to
the giant component if and only if its branching process of new infections
grows beyond all limits, and we know from before that this happens with
probability $1 - \pi$ defined in equation (3.4). From this it follows that the
asymptotic final proportion infected, $\tau$, equals $1 - \pi$. So, $\tau$ is both the
probability of a major outbreak, and the relative size of the outbreak in case a
major outbreak occurs.

The above arguments motivate the following theorem, which is proven
later in the paper, and where $Z_n$ denotes the final number infected in the epidemic.

Theorem 3.1. If $R_0 < 1$ then $Z_n/n \overset{p}{\to} 0$. If $R_0 > 1$, then $Z_n/n$ converges
to a two-point distribution $Z$ for which $P(Z = 0) = \pi$ and $P(Z = \tau) = \tau$,
where $\pi$ is defined by (3.3) and (3.4) and $\tau = 1 - \pi$.

3.2. Uniform vaccination. Prior to arrival of the infectious disease, each
individual is vaccinated independently and with the same probability $v$,
which implies that the total number of vaccinated $V$ is $\mathrm{Bi}(n, v)$, and by
the law of large numbers the random proportion vaccinated satisfies $V/n \overset{p}{\to} v$.

Vaccinated individuals, and edges connecting to them, can be removed
from the graph since there will be no spreading between these individuals
and their friends in either direction. As a consequence, an individual who
originally had $d$ friends now has $\mathrm{Bi}(d, 1 - v)$ unvaccinated friends. If an
individual gets infected during the early stages of the epidemic he will infect
each of his unvaccinated friends independently with probability $p$. Given
that the initially infected has degree $d$ he will hence infect $\mathrm{Bi}(d, p(1 - v))$
friends, so without the conditioning he will infect a mixed binomial number
$X_v \sim \mathrm{MixBi}(D, p(1 - v))$. Similarly, during the early stages an infected
individual with degree $d$ will infect $\mathrm{Bi}(d - 1, p(1 - v))$, and unconditionally
such an individual has degree distribution $\{\tilde p_k\}$, so the unconditional number he
will infect, $\tilde X_v$, will be $\mathrm{MixBi}(\tilde D - 1, p(1 - v))$.

It is seen that we have the same type of distributions as in the case
without vaccination. As a consequence, all results for the case with uniform
vaccination can be obtained from the case without vaccination simply by
replacing $p$ by $p(1 - v)$. We hence have that the reproduction number $R^U_{v;p}$
after vaccinating a fraction $v$ chosen uniformly satisfies

    $R^U_{v;p} = p(1 - v)\left(\mu + \frac{\mathrm{Var}(D) - \mu}{\mu}\right) = (1 - v) R_0.$    (3.5)

The probability $\tilde\pi^U_{v;p}$ that the epidemic never takes off, assuming the initially
infected has $\tilde X_v$ unvaccinated friends, is the smallest solution to

    $\tilde\pi^U_{v;p} = \frac{f_D'(1 - p(1 - v)(1 - \tilde\pi^U_{v;p}))}{f_D'(1)}.$    (3.6)

The probability $\pi^U_{v;p}$ that the epidemic never takes off if the initially infected
is selected randomly among the unvaccinated is given by

    $\pi^U_{v;p} = f_D(1 - p(1 - v)(1 - \tilde\pi^U_{v;p})),$    (3.7)

where $\tilde\pi^U_{v;p}$ is the smallest solution to (3.6). Finally, the final size is
determined from the probability of a major outbreak as before. This means
that the final proportion infected (among the unvaccinated!) will converge
to $1 - \pi^U_{v;p}$ in case of a major outbreak. We have the following corollary,
where $Z^U_n(v)$ denotes the final number infected in the epidemic where each
individual was vaccinated independently with probability $v$ ($0 < v < 1$)
prior to the outbreak, and where the initially infected was chosen randomly
among the unvaccinated.

Theorem 3.2. If $R^U_{v;p} < 1$, then $Z^U_n(v)/((1 - v)n) \overset{p}{\to} 0$. If $R^U_{v;p} > 1$,
then $Z^U_n(v)/((1 - v)n)$ converges to a two-point distribution $Z^U_{v;p}$ for which
$P(Z^U_{v;p} = 0) = \pi^U_{v;p}$ and $P(Z^U_{v;p} = \tau^U_{v;p}) = \tau^U_{v;p}$, where $\pi^U_{v;p}$ is defined by (3.6)
and (3.7) and $\tau^U_{v;p} = 1 - \pi^U_{v;p}$.
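
Since uniform vaccination amounts to replacing p by p(1 - v), the reproduction number after vaccination is (1 - v) times the one without vaccination, and the quantities in Theorem 3.2 can be evaluated numerically. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, the degree distribution is a finite dictionary mapping degree to probability, and the fixed-point equation is solved by iteration from 0. The critical coverage returned, max(0, 1 - 1/R_0), is the smallest v for which (1 - v) R_0 does not exceed 1.

```python
def uniform_vaccination(pmf, p, v, tol=1e-12, max_iter=100_000):
    """Return (R_v, tau_v) under uniform vaccination with coverage v.

    pmf maps degree j to probability p_j; p is the transmission probability.
    tau_v is the limiting proportion infected among the unvaccinated in a
    major outbreak (numerically close to 0 when no major outbreak is possible).
    """
    q = p * (1.0 - v)  # effective transmission probability p(1 - v)
    mu = sum(j * pj for j, pj in pmf.items())
    r_v = q * sum(j * (j - 1) * pj for j, pj in pmf.items()) / mu
    x = 0.0  # iterate from 0 to reach the smallest non-negative root
    for _ in range(max_iter):
        s = 1.0 - q * (1.0 - x)
        new = sum(j * pj * s ** (j - 1) for j, pj in pmf.items() if j > 0) / mu
        converged = abs(new - x) < tol
        x = new
        if converged:
            break
    s = 1.0 - q * (1.0 - x)
    pi = sum(pj * s ** j for j, pj in pmf.items())
    return r_v, 1.0 - pi


def critical_coverage_uniform(pmf, p):
    """Return max(0, 1 - 1/R_0), the critical coverage for uniform vaccination."""
    r0, _ = uniform_vaccination(pmf, p, 0.0)
    return 0.0 if r0 <= 1.0 else 1.0 - 1.0 / r0
```

For a graph in which every vertex has degree 3 and p = 1, the basic reproduction number is 2, so half of the population must be vaccinated; with v = 0.25 the reproduction number is 1.5 and the limiting proportion infected among the unvaccinated is 26/27.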

3.3. Acquaintance vaccination. Recall that each individual is sampled,
independently, a $\mathrm{Po}(c)$ number of times, where $0 \le c < \infty$, so in total $\mathrm{Po}(nc)$
individuals are sampled. Each time an individual is sampled a randomly
chosen friend of the individual is selected and vaccinated (unless it already was
vaccinated). The effect of this strategy is that vaccinated individuals have
the size biased degree distribution $(\tilde p_j)_{j=0}^\infty$, where $\tilde p_j = j p_j / \sum_k k p_k$, rather
than the original degree distribution $\{p_k\}$ for uniformly selected individuals.
The proportion vaccinated $v = v(c)$ is obtained as follows. An individual
avoids being vaccinated if he is not vaccinated "through" any of his friends.
The friends of the individual have independent degrees with distribution $(\tilde p_j)_{j=0}^\infty$,
and the probability of not being vaccinated "through" an individual with
degree $k$ is $e^{-c/k}$. It follows that the probability to avoid being vaccinated
from one friend equals

    $\alpha = \alpha(c) = \sum_{k=1}^\infty e^{-c/k} \tilde p_k = \sum_{k=1}^\infty e^{-c/k} \frac{k p_k}{\mu}.$    (3.8)

(Note that $\alpha$ has the same interpretation as the $\alpha$ introduced for the edgewise
strategies, but it is a different function of $c$.) If the individual in question
has $j$ friends it hence avoids being vaccinated with probability $\alpha^j$. The
proportion $1 - v(c)$ not being vaccinated equals the probability that a randomly
selected individual is not vaccinated, which hence equals

    $1 - v(c) = \sum_{j=0}^\infty \alpha^j p_j = f_D(\alpha),$    (3.9)

where as before $f_D(\cdot)$ is the probability generating function of a random
variable $D$ having distribution $(p_j)_{j=0}^\infty$.
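
The map from the sampling intensity c to the proportion vaccinated v(c) can be evaluated as in the following Python sketch, which computes the probability of not being vaccinated through one friend as the size biased average of exp(-c/k) and then applies the generating function of the degree distribution. The function names `alpha` and `vaccinated_fraction` and the dictionary representation of the degree distribution are assumptions of the sketch.

```python
import math


def alpha(pmf, c):
    """Probability of not being vaccinated through one given friend.

    pmf maps degree k to probability p_k; the friend's degree is size biased,
    and a friend of degree k fails to name the individual w.p. exp(-c / k).
    """
    mu = sum(k * pk for k, pk in pmf.items())
    return sum(math.exp(-c / k) * k * pk / mu for k, pk in pmf.items() if k >= 1)


def vaccinated_fraction(pmf, c):
    """Proportion vaccinated, v(c) = 1 - f_D(alpha(c))."""
    a = alpha(pmf, c)
    return 1.0 - sum(pj * a ** j for j, pj in pmf.items())
```

For instance, if every individual has degree 3 and c = 3, then alpha equals exp(-1) and the proportion vaccinated is 1 - exp(-3), which is smaller than c, as noted in Section 2.4.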

Note that in this model, given the graph, individuals are vaccinated
independently of each other (although with different probabilities). It follows
easily that the actual (random) number $V$ of vaccinated persons satisfies

    $V/n \overset{p}{\to} v(c)$ as $n \to \infty$.    (3.10)

Hence we will ignore the randomness in $V$ and regard $v(c)$ given by (3.9) as
the proportion of vaccinated persons.

We now approximate the initial stages of an epidemic, occurring in a 
community having been vaccinated according to the acquaintance strategy, 
with a suitable branching process. To find "the right" branching process ap- 
proximation is harder for the acquaintance strategy because the vaccination 
status of an individual depends on the degrees of its friends. We therefore 
introduce some convenient terminology. 

We say that transmission may take place through an edge, and through
its two half-edges, if it is one of the edges in $G_p$, i.e., one of the randomly
selected edges which will spread the disease if one of its endpoints is infected.
(Recall that we may assume this random selection to take place before the
start of the infection.) Further, there is a natural correspondence between
half-edges and directed edges, with a half-edge corresponding to the edge it
is part of, directed so that it begins with this half-edge. We say that a
directed edge, or the corresponding half-edge, is used for vaccination, if the
person at the start of the edge is selected and names the person at the end
of the edge, who thus gets vaccinated.

It turns out that a suitable "individual" in the branching process is an 
unvaccinated person together with a directed edge from this person such 
that transmission may take place through the edge but it is not used for 
vaccination. It is worth noting that a person may be part of several "indi- 
viduals" in the branching process (if the person was not vaccinated and has 
several friends such that the connecting edges satisfy the conditions above). 
See Figure 1 for an illustration of an "individual" (a) and of situations where
the "individual" "gives birth" to one (b) and to no (c) new "individuals".

Figure 1. a) An illustration of an "individual" in the branching process.
In b) the left "individual" has one offspring (the up-going edge constitutes
no individual since there is no transmission, and the down-going edge no
individual since the friend was sampled and named the individual below for
vaccination). In c) no individual is born since the friend was vaccinated
(being named by some other friend).

In order to analyse the corresponding branching process we have to
determine the distribution of how many new "individuals" one "individual" will
infect during the early stages of the epidemic assuming a large population
(large $n$). We know that the "individual" contains an unvaccinated person,
so the edge in the "individual" has not been used for vaccination backwards,
i.e. in the opposite direction. As a consequence, we have to condition on
this, and then the node at the other end of the edge has degree $K = k$ with
probability

$$P(K = k) = \frac{\tilde p_k e^{-c/k}}{\alpha} = \frac{k p_k e^{-c/k}}{\mu\alpha}, \qquad k = 1, 2, \dots, \tag{3.11}$$

i.e. the size biased degree distribution conditional on not having vaccinated
backwards. In order for this friend to create new "individuals", it must not
have been vaccinated by any of its other $k - 1$ friends (by assumption it was
not vaccinated from our original individual). This happens with probability
$\alpha^{k-1}$. Each of the friend's remaining $k - 1$ edges will be open (i.e.,
transmission may take place but it is not used for vaccination) independently,
each open with probability $pe^{-c/k}$. The number of open edges (equal to
the number of new "individuals") is hence $\mathrm{Bi}(k-1, pe^{-c/k})$. If the
friend is vaccinated (probability $1 - \alpha^{k-1}$) no new individuals are born.
The unconditional number $Y$ of new "individuals" an individual "gives birth" to,
i.e. the off-spring distribution of the approximating branching process, can
be obtained by conditioning on the number of friends our friend has and
recalling that no individuals are born whenever the friend is vaccinated or
the binomial variable equals 0:

$$\begin{aligned}
P(Y = 0) &= \sum_{k=1}^\infty \Bigl((1 - \alpha^{k-1}) + \alpha^{k-1}\bigl(1 - pe^{-c/k}\bigr)^{k-1}\Bigr)\frac{\tilde p_k e^{-c/k}}{\alpha},\\
P(Y = j) &= \sum_{k=j+1}^\infty \alpha^{k-1}\binom{k-1}{j}\bigl(pe^{-c/k}\bigr)^{j}\bigl(1 - pe^{-c/k}\bigr)^{k-1-j}\frac{\tilde p_k e^{-c/k}}{\alpha}, \qquad j \ge 1.
\end{aligned} \tag{3.12}$$

This off-spring distribution determines the reproduction number $R^A_{c;p}$,
the probability of a major outbreak, and the final size in case of a major
outbreak. For instance, the reproduction number is the mean of this
distribution, and this mean is obtained by first conditioning on the degree of
the node in question. Given that the degree equals $k$, the average number of
off-spring equals $\alpha^{k-1}(k-1)pe^{-c/k}$, which gives the following
reproduction number:

$$R^A_{c;p} = \sum_{k\ge1} \alpha^{k-1}(k-1)pe^{-c/k}\,\frac{\tilde p_k e^{-c/k}}{\alpha} = p\sum_{k\ge1} (k-1)\,\alpha^{k-2} e^{-2c/k}\,\tilde p_k \tag{3.13}$$

(cf. [7]). Let $f_Y(s) = E(s^Y)$ be the probability generating function of this
off-spring distribution. If the epidemic starts by one "individual", i.e. one
person with one open directed edge, then the probability $\tilde\pi^A_{c;p}$ that
the epidemic never takes off is the smallest solution to the equation

$$\tilde\pi^A_{c;p} = f_Y(\tilde\pi^A_{c;p}). \tag{3.14}$$

If we start with one infected person that is unvaccinated and has degree
$j$, then each of its $j$ half-edges is open with probability $pe^{-c/j}$, and the
probability that a given half-edge does not start a large epidemic is
$1 - pe^{-c/j} + pe^{-c/j}\tilde\pi^A_{c;p}$, so the probability that the epidemic
never takes off equals $\bigl(1 - pe^{-c/j}(1 - \tilde\pi^A_{c;p})\bigr)^j$, for
$j \ge 1$, and 1 for $j = 0$.

If instead the initially infected is chosen randomly among the unvaccinated
as we assume, then the probability that it has degree $j$ is
$p_j\alpha^j / \sum_i p_i\alpha^i$, cf. (3.9), and thus the probability that the
epidemic never takes off equals

$$\pi^A_{c;p} = \frac{p_0 + \sum_{j\ge1} p_j\alpha^j\bigl(1 - pe^{-c/j}(1 - \tilde\pi^A_{c;p})\bigr)^j}{\sum_{j\ge0} p_j\alpha^j}. \tag{3.15}$$

Finally, using the same reasoning as before, the limiting proportion infected
in case of a major outbreak equals $\tau^A_{c;p} = 1 - \pi^A_{c;p}$. We summarize our
results in the following theorem, proved in Section 5, where $Z^A_n(c)$ denotes
the final number infected in the epidemic where vaccination is done prior
to the outbreak according to the acquaintance vaccination strategy. Recall
that $0 \le c < \infty$ and that $v(c)$, the proportion of the population vaccinated,
is given by (3.9) with $\alpha = \alpha(c)$ given by (3.8).

Theorem 3.3. $Z^A_n(c)/((1 - v(c))n) \xrightarrow{p} 0$ if $R^A_{c;p} \le 1$, where
$R^A_{c;p}$ is defined by (3.13). If $R^A_{c;p} > 1$, then $Z^A_n(c)/((1 - v(c))n)$
converges to a two-point distribution $Z^A_{c;p}$ for which
$P(Z^A_{c;p} = 0) = \pi^A_{c;p}$ and $P(Z^A_{c;p} = \tau^A_{c;p}) = \tau^A_{c;p}$,
where $\pi^A_{c;p}$ is defined by (3.14) and (3.15), and $\tau^A_{c;p} = 1 - \pi^A_{c;p}$.
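The quantities in Theorem 3.3 can be evaluated numerically. The sketch below is one possible implementation, assuming a truncated Poisson(6) degree distribution and $p = 0.5$; the function names and the fixed-point iteration count are choices made for this illustration, and the iteration from 0 is used as a simple way to reach the smallest root of (3.14).

```python
import math

def poisson_pmf(lam, kmax=80):
    p = [math.exp(-lam)]
    for k in range(1, kmax + 1):
        p.append(p[-1] * lam / k)
    return p

def acquaintance_quantities(pmf, p, c, iters=5000):
    ks = range(1, len(pmf))
    mu = sum(k * pk for k, pk in enumerate(pmf))
    tilde = [k * pk / mu for k, pk in enumerate(pmf)]            # size-biased degrees
    alpha = sum(math.exp(-c / k) * tilde[k] for k in ks)          # (3.8)
    w = [0.0] + [tilde[k] * math.exp(-c / k) / alpha for k in ks] # (3.11)
    q = [0.0] + [p * math.exp(-c / k) for k in ks]                # open-edge probability
    R = sum(w[k] * alpha ** (k - 1) * (k - 1) * q[k] for k in ks) # (3.13)

    def f_Y(s):
        # pgf of Y from (3.12): no offspring if the friend is vaccinated,
        # otherwise a Bi(k-1, q_k) number of offspring
        return sum(w[k] * ((1 - alpha ** (k - 1))
                           + alpha ** (k - 1) * (1 - q[k] + q[k] * s) ** (k - 1))
                   for k in ks)

    pi_tilde = 0.0
    for _ in range(iters):            # monotone iteration towards the smallest root of (3.14)
        pi_tilde = f_Y(pi_tilde)

    f_D_alpha = sum(pk * alpha ** j for j, pk in enumerate(pmf))  # (3.9)
    num = pmf[0] + sum(pmf[j] * alpha ** j * (1 - q[j] * (1 - pi_tilde)) ** j for j in ks)
    pi = num / f_D_alpha                                          # (3.15)
    return {"v": 1 - f_D_alpha, "R": R, "pi_tilde": pi_tilde, "tau": 1 - pi}

pmf = poisson_pmf(6.0)
for c in (0.0, 0.5, 2.0):
    r = acquaintance_quantities(pmf, 0.5, c)
    print(f"c = {c:3.1f}   v = {r['v']:.3f}   R = {r['R']:.3f}   tau = {r['tau']:.3f}")
```

For $c = 0$ the sketch reduces to the unvaccinated epidemic, and for large $c$ the reproduction number drops below 1 so that $\tau$ vanishes.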

3.4. Edgewise vaccination. Recall that, for both E1 and E2, a person with
$d$ friends is unvaccinated with probability $\alpha^d$ (here $\alpha$ has the same
meaning as in the previous subsection, but it can be treated as a free
parameter). Thus,

$$E V = n\sum_d p_d(1 - \alpha^d) + o(n)$$

and a simple variance estimate shows that the vaccinated proportion

$$V/n \xrightarrow{p} v(\alpha) := \sum_d p_d(1 - \alpha^d), \tag{3.16}$$

just as for acquaintance vaccination, see (3.9) and (3.10).
We define open (directed) edges as for acquaintance vaccination, and argue
as there with the following modifications. The other endpoint of an
open edge has just the size-biased distribution $(\tilde p_k)$. If this vertex,
$y$ say, has degree $k$, it is unvaccinated with probability $\alpha^{k-1}$, and in
that case, the number of new open edges originating at $y$ is
$\mathrm{Bi}(k-1, p\alpha)$ for E1 and $\mathrm{Bi}(k-1, p)$ for E2. The difference
between the two versions is because we already know that these edges do not
vaccinate $y$, and for E2, this implies that they do not vaccinate their other
endpoint either, while for E1 that is an independent event with probability
$\alpha$.

We thus have the offspring distributions for E1 and E2, cf. (3.12),

$$\begin{aligned}
P(Y_1 = j) &= \sum_{k=j+1}^\infty \tilde p_k\,\alpha^{k-1}\binom{k-1}{j}(p\alpha)^j(1 - p\alpha)^{k-1-j}, \qquad j \ge 1,\\
P(Y_2 = j) &= \sum_{k=j+1}^\infty \tilde p_k\,\alpha^{k-1}\binom{k-1}{j}p^j(1 - p)^{k-1-j}, \qquad j \ge 1;
\end{aligned}$$

we leave the formulas for $P(Y_1 = 0)$ and $P(Y_2 = 0)$ to the reader.
This gives the reproduction numbers

$$\begin{aligned}
R^{E1}_{\alpha;p} &= E(Y_1) = \sum_{k\ge1} \tilde p_k\,\alpha^{k-1}(k-1)p\alpha = p\sum_k (k-1)\tilde p_k\,\alpha^{k},\\
R^{E2}_{\alpha;p} &= E(Y_2) = \sum_{k\ge1} \tilde p_k\,\alpha^{k-1}(k-1)p = p\sum_k (k-1)\tilde p_k\,\alpha^{k-1}.
\end{aligned} \tag{3.17}$$

Note that $R^{E1}_{\alpha;p} = \alpha R^{E2}_{\alpha;p} < R^{E2}_{\alpha;p}$, which shows
that, with the same number of vaccinations, E1 is a better strategy than E2. In
particular, the critical vaccination coverage $v_c$ is smaller for E1 than for
E2. An intuitive explanation of why E2 is not as efficient as E1 is that in E2
both individuals of selected friendships are vaccinated, and since an
individual is partly protected by friends getting vaccinated the second
vaccination is less "efficient".

We let $\tilde\pi^{E1}_{\alpha;p}$ and $\tilde\pi^{E2}_{\alpha;p}$ be the probabilities
that the Galton-Watson processes with offspring distributions $Y_1$ and $Y_2$,
respectively, starting with one individual, die out; they are thus the smallest
positive solutions to $t = f_{Y_1}(t)$ and $t = f_{Y_2}(t)$, where $f_{Y_1}$ and
$f_{Y_2}$ are the corresponding probability generating functions.

If we start with one unvaccinated person $x$ with degree $d$, the number of
open edges from $x$ is $\mathrm{Bi}(d, p\alpha)$ for E1 and $\mathrm{Bi}(d, p)$ for E2,
for the same reason as for the number of new edges above. The probability that
the epidemic never takes off is thus
$(1 - p\alpha + p\alpha\tilde\pi^{E1}_{\alpha;p})^d$ for E1 and
$(1 - p + p\tilde\pi^{E2}_{\alpha;p})^d$ for E2.

If the initially infected is chosen randomly among the unvaccinated, so that
it has degree $d$ with probability $p_d\alpha^d / \sum_i p_i\alpha^i$, we
thus find the probabilities that the epidemic never takes off

$$\pi^{E1}_{\alpha;p} = \frac{\sum_d p_d\alpha^d\bigl(1 - p\alpha + p\alpha\tilde\pi^{E1}_{\alpha;p}\bigr)^d}{\sum_d p_d\alpha^d}, \qquad
\pi^{E2}_{\alpha;p} = \frac{\sum_d p_d\alpha^d\bigl(1 - p + p\tilde\pi^{E2}_{\alpha;p}\bigr)^d}{\sum_d p_d\alpha^d}. \tag{3.18}$$

We summarize our results as before, letting $Z^{E1}_n(\alpha)$ and
$Z^{E2}_n(\alpha)$ denote the final numbers infected in the epidemic for the two
strategies. Recall that $v(\alpha)$ is given by (3.16).

Theorem 3.4. For $j = 1, 2$, $Z^{Ej}_n(\alpha)/((1 - v(\alpha))n) \xrightarrow{p} 0$
if $R^{Ej}_{\alpha;p} \le 1$, where $R^{Ej}_{\alpha;p}$ is defined by (3.17). If
$R^{Ej}_{\alpha;p} > 1$, then $Z^{Ej}_n(\alpha)/((1 - v(\alpha))n)$ converges to a
two-point distribution $Z^{Ej}_{\alpha;p}$ for which
$P(Z^{Ej}_{\alpha;p} = 0) = \pi^{Ej}_{\alpha;p}$ and
$P(Z^{Ej}_{\alpha;p} = \tau^{Ej}_{\alpha;p}) = \tau^{Ej}_{\alpha;p}$, where
$\pi^{Ej}_{\alpha;p}$ is defined by (3.18), and
$\tau^{Ej}_{\alpha;p} = 1 - \pi^{Ej}_{\alpha;p}$.
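To make (3.16)-(3.18) concrete, here is a small sketch that evaluates the edgewise quantities for a toy degree distribution in which half of the vertices have degree 1 and half have degree 3. The distribution, the function name `edgewise_quantities` and the parameter values are assumptions chosen so that the results can be checked by hand.

```python
def edgewise_quantities(pmf, p, alpha, strategy, iters=5000):
    """Coverage, reproduction number and final size for strategy 'E1' or 'E2'."""
    ks = range(1, len(pmf))
    mu = sum(k * pk for k, pk in enumerate(pmf))
    tilde = [k * pk / mu for k, pk in enumerate(pmf)]        # size-biased degrees
    q = p * alpha if strategy == "E1" else p                 # open-edge probability
    R = sum(tilde[k] * alpha ** (k - 1) * (k - 1) * q for k in ks)   # (3.17)

    def f_Y(t):
        # pgf of the offspring number: 0 if the endpoint is vaccinated,
        # otherwise Bi(k-1, q)
        return sum(tilde[k] * ((1 - alpha ** (k - 1))
                               + alpha ** (k - 1) * (1 - q + q * t) ** (k - 1))
                   for k in ks)

    pi_tilde = 0.0
    for _ in range(iters):           # iterate from 0 towards the smallest fixed point
        pi_tilde = f_Y(pi_tilde)

    unvaccinated = sum(pk * alpha ** d for d, pk in enumerate(pmf))  # 1 - v(alpha), (3.16)
    pi = sum(pk * alpha ** d * (1 - q + q * pi_tilde) ** d
             for d, pk in enumerate(pmf)) / unvaccinated             # (3.18)
    return {"v": 1 - unvaccinated, "R": R, "pi": pi, "tau": 1 - pi}

toy = [0.0, 0.5, 0.0, 0.5]          # p_1 = p_3 = 1/2
for strategy in ("E1", "E2"):
    for alpha in (1.0, 0.8):
        r = edgewise_quantities(toy, 1.0, alpha, strategy)
        print(strategy, alpha, {k: round(x, 4) for k, x in r.items()})
```

For this toy distribution with $p = 1$ and $\alpha = 1$ (no vaccination) the offspring generating function is $1/4 + 3t^2/4$, so $\tilde\pi = 1/3$ and $\tau = 22/27$; with $\alpha = 0.8$ both reproduction numbers fall below 1.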

3.5. Examples. We now compare the performance of the different vaccination
strategies on two examples. In the first example we have chosen the
degree distribution to be Poisson with mean $\lambda = 6$, and the
transmission probability to equal $p = 0.5$. Using (3.2) we conclude that
this implies that $R_0 = 3$. The assumption of Poisson distributed degree
means that this applies to the simple $G(n, p = 6/n)$ graph with transmission
probability $p = 0.5$; in the epidemic literature this model is known as
the Reed-Frost model (e.g. [2]). In Figure 2 we show $\tau$, the final proportion
infected among unvaccinated in case of a major outbreak, as a function of
the vaccination coverage $v$, for the 4 different vaccination strategies treated.
It is seen that the acquaintance and edgewise E1 strategies perform best in
the sense that, for a fixed proportion vaccinated, the proportion $\tau$ getting
infected in case of a major outbreak is smallest for these two strategies. As a
consequence, the critical vaccination coverage, $v_c = \inf\{v : R_v \le 1\}$, is
also smallest for these two strategies. There is no unique ordering of the two
strategies: the acquaintance strategy is slightly better for small vaccination
coverages and E1 is slightly better for higher vaccination coverages and
hence also has slightly smaller $v_c$. The edgewise strategy E2 is not as good
as these two strategies but still better than the uniform vaccination
strategy. (Indeed, E2 is always less efficient than E1, see above.) Acquaintance,
E1 and E2 all perform better than the uniform strategy, the reason being
that they tend to find individuals with high degrees. For the parameter
choices of this example, the critical vaccination coverages equal
$v_c \approx 0.56$ for the acquaintance and E1 strategies, $v_c \approx 0.61$ for
E2 and $v_c \approx 0.67$ for the uniform vaccination strategy.
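The critical coverages of the first example can be approximated by solving $R = 1$ numerically. The sketch below does this by bisection using (3.8), (3.9), (3.13), (3.16) and (3.17); it assumes that uniform vaccination scales the reproduction number by $1 - v$, and all function names are hypothetical. It is meant as a rough check, not as the computation used to produce the figures.

```python
import math

def poisson_pmf(lam, kmax=80):
    p = [math.exp(-lam)]
    for k in range(1, kmax + 1):
        p.append(p[-1] * lam / k)
    return p

PMF = poisson_pmf(6.0)
P_TRANS = 0.5
MU = sum(k * pk for k, pk in enumerate(PMF))
TILDE = [k * pk / MU for k, pk in enumerate(PMF)]     # size-biased degrees

def f_D(s):
    return sum(pk * s ** k for k, pk in enumerate(PMF))

def alpha_of_c(c):                                    # (3.8)
    return sum(math.exp(-c / k) * TILDE[k] for k in range(1, len(PMF)))

def R_acquaintance(c):                                # (3.13)
    a = alpha_of_c(c)
    return P_TRANS * sum((k - 1) * a ** (k - 2) * math.exp(-2 * c / k) * TILDE[k]
                         for k in range(2, len(PMF)))

def R_edgewise(a, strategy):                          # (3.17)
    extra = a if strategy == "E1" else 1.0
    return P_TRANS * extra * sum((k - 1) * TILDE[k] * a ** (k - 1)
                                 for k in range(2, len(PMF)))

def bisect_root(g, lo, hi, iters=80):
    """Root of g on [lo, hi], assuming g changes sign exactly once there."""
    sign_lo = g(lo) > 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (g(mid) > 0) == sign_lo:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

R0 = R_edgewise(1.0, "E2")                            # alpha = 1: no vaccination
vc_uniform = 1 - 1 / R0                               # assumes R_v = (1 - v) R_0
vc_acq = 1 - f_D(alpha_of_c(bisect_root(lambda c: R_acquaintance(c) - 1, 0.0, 10.0)))
vc_E1 = 1 - f_D(bisect_root(lambda a: R_edgewise(a, "E1") - 1, 0.0, 1.0))
vc_E2 = 1 - f_D(bisect_root(lambda a: R_edgewise(a, "E2") - 1, 0.0, 1.0))

print(f"uniform {vc_uniform:.3f}  acquaintance {vc_acq:.3f}  E1 {vc_E1:.3f}  E2 {vc_E2:.3f}")
```

The values obtained this way are close to those quoted in the text and reproduce the ordering E1, acquaintance, E2, uniform.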

Figure 2. Final proportion infected $\tau$ as a function of the
vaccination coverage $v$ for four vaccination strategies: uniform ( - ),
acquaintance (•••), E1 ( ) and E2 (- • - • -).
The degree distribution is Po(6) and transmission probability $p = 0.5$.

In the second example (illustrated in Figure 3) we chose a more heavy
tailed degree distribution having $p_d \propto d^{-3.5}$ (in the computations it
was truncated at $d = 200$). The initial values were modified such that
$E(D) \approx 6$ to make it more comparable to the previous example, with a
resulting variance equal to 18.9. The transmission parameter was set to
$p = 0.5$ as before. Using (3.2) we hence see that $R_0 \approx 4.1$. In the
figure we see the same type of pattern as in the previous example. However, the
difference between the strategies is more pronounced, with $v_c \approx 0.50$
for the acquaintance and E1 strategies, $v_c \approx 0.55$ for E2 and
$v_c \approx 0.75$ for the uniform vaccination strategy. In other words, if the
uniform strategy is applied in these two examples we have to vaccinate more
individuals if the degree distribution is heavy-tailed, but if any of the other
strategies is used, the heavy-tailed degree distribution requires fewer
vaccinations to surely prevent an outbreak. Another minor difference from the
previous example is that, for the present heavy-tailed distribution, the
acquaintance strategy is (slightly) better than E1 for all vaccination
coverages and hence also has a smaller critical vaccination coverage. However,
the difference between the two strategies is negligible.
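As a quick consistency check of the numbers quoted in the two examples, assume that (3.2) has the familiar form $R_0 = p\,(E(D) + \mathrm{Var}(D)/E(D) - 1)$ and that uniform vaccination of a fraction $v$ multiplies the reproduction number by $1 - v$, so that $v_c = 1 - 1/R_0$ for the uniform strategy (neither formula is restated in this subsection, so both are assumptions here). Then

$$\begin{aligned}
\text{Poisson example:}\quad & R_0 = 0.5\,\bigl(6 + \tfrac{6}{6} - 1\bigr) = 3, && v_c = 1 - \tfrac13 \approx 0.67,\\
\text{heavy-tailed example:}\quad & R_0 \approx 0.5\,\bigl(6 + \tfrac{18.9}{6} - 1\bigr) \approx 4.1, && v_c \approx 1 - \tfrac{1}{4.1} \approx 0.75,
\end{aligned}$$

in agreement with the values stated for the uniform strategy.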
Figure 3. Final proportion infected as a function of the
vaccination coverage for four vaccination strategies: uniform ( - ),
acquaintance (•••), E1 ( ) and E2 (- • - • -). The
degree distribution is heavy-tailed ($p_d \propto d^{-3.5}$) with mean
$E(D) \approx 6$ and $p = 0.5$.

Note that all $\tau$'s in both examples denote the proportion of infected among
the unvaccinated (in case of an outbreak) and can hence be thought of as an
indirect protection from those getting vaccinated. Of course, by assumption,
all vaccinated are also protected from getting infected.

4. Preliminaries on branching processes 

As said above, our method is based on comparison with branching
processes, more precisely Galton-Watson processes, see e.g. [3] for definitions
and basic facts. If $\mathcal X$ is a Galton-Watson process started with 1 initial
particle, we let $\mathcal X^d$ denote the same branching process with $d$ initial
particles, i.e. the union of $d$ independent copies of $\mathcal X$. Further, for
any Galton-Watson process $\mathcal X$, we let $|\mathcal X|$ denote its total
progeny, i.e. the total number of particles in all generations, and we let
$\rho(\mathcal X)$ be the survival probability of $\mathcal X$, i.e.
$\rho(\mathcal X) := P(|\mathcal X| = \infty)$. Note that if $\mathcal X$ starts with
1 particle, then

$$\rho(\mathcal X^d) = 1 - (1 - \rho(\mathcal X))^d, \tag{4.1}$$

since $\mathcal X^d$ dies out if and only if all $d$ copies of $\mathcal X$ in it do.

We will need the following simple continuity result, which presumably is 
well known although we have failed to find a reference. 

Lemma 4.1. Let $X_\nu$ and $X$ be non-negative integer-valued random
variables, and let $\mathcal X_\nu^d$ and $\mathcal X^d$ be the corresponding
Galton-Watson processes with offspring distributions $X_\nu$ and $X$, starting
with $d$ particles. If $X_\nu \xrightarrow{d} X$ as $\nu \to \infty$, and
$P(X = 1) < 1$, then $\rho(\mathcal X_\nu^d) \to \rho(\mathcal X^d)$, for every
fixed $d \ge 0$.

Proof. By (4.1), it suffices to show this for $d = 1$, and we then drop the
superscript 1.

Consider the probability generating functions $f_X(t) := E\,t^X$ and
$f_{X_\nu}(t) := E\,t^{X_\nu}$ for $0 \le t \le 1$. It is well-known, see e.g.
[3, Theorem I.5.1], that the extinction probability $q := 1 - \rho(\mathcal X)$ is
the smallest root in $[0, 1]$ of $f_X(q) = q$. It follows easily, since we have
excluded the possibility $f_X(t) = t$, that if $0 \le t < q$, then $f_X(t) > t$,
and if $q < t < 1$, then $f_X(t) < t$.

Since $X_\nu \xrightarrow{d} X$, we have $f_{X_\nu}(t) \to f_X(t)$ for every
$t \in [0, 1]$. Hence, if $0 \le t < q$, then $f_{X_\nu}(t) > t$ for large $\nu$,
and thus $q_\nu := 1 - \rho(\mathcal X_\nu) > t$. Similarly, if $q < t < 1$, then,
for large $\nu$, $f_{X_\nu}(t) < t$ and thus $q_\nu < t$. It follows that
$q_\nu \to q$ as $\nu \to \infty$. □
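The characterisation of the extinction probability used in the proof is easy to explore numerically. The following sketch (function names are illustrative) iterates $q \mapsto f_X(q)$ from $q = 0$, which increases monotonically to the smallest root in $[0, 1]$, and then applies (4.1).

```python
def extinction_probability(offspring_pmf, iters=10000):
    """Smallest root in [0, 1] of f(q) = q, where f is the offspring pgf."""
    q = 0.0
    for _ in range(iters):
        q = sum(pj * q ** j for j, pj in enumerate(offspring_pmf))
    return q

def survival_probability(offspring_pmf, d=1):
    """Survival probability when starting with d particles, by (4.1)."""
    return 1.0 - extinction_probability(offspring_pmf) ** d

# Offspring 0 w.p. 1/4 and 2 w.p. 3/4: f(t) = 1/4 + 3t^2/4, smallest root 1/3.
X = [0.25, 0.0, 0.75]
print(extinction_probability(X), survival_probability(X, d=3))

# A sequence X_nu -> X in distribution: mix in extra mass 1/nu at 0.
for nu in (10, 100, 1000):
    X_nu = [(1 - 1 / nu) * 0.25 + 1 / nu, 0.0, (1 - 1 / nu) * 0.75]
    print(nu, extinction_probability(X_nu))

# Why P(X = 1) < 1 is needed: Bernoulli offspring with success probability
# 0.9 dies out almost surely, while the process with X = 1 never dies out.
print(extinction_probability([0.1, 0.9]), extinction_probability([0.0, 1.0]))
```

The middle loop shows the extinction probabilities approaching $1/3$, as the lemma predicts.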

Remark 4.2. The case $P(X = 1) = 1$, i.e. $X = 1$ a.s., really is an exception.
If we let $X_\nu \sim \mathrm{Be}(1 - 1/\nu)$ we have
$X_\nu \xrightarrow{d} X = 1$, but $\rho(\mathcal X_\nu) = 0$ for every $\nu$
while $\rho(\mathcal X) = 1$.

5. The giant component 

Our ultimate goal is to describe the large component(s) of
$G^*(n, (d_i)_1^n)_{v;p}$ and $G(n, (d_i)_1^n)_{v;p}$, where $v$ is one of the
vaccination strategies defined above. The basic strategy will be to relate the
neighbourhoods of a vertex to a branching process. We do this for
$G^*(n, (d_i)_1^n)$, which is technically easier to handle; as explained in
Subsection 5.2, the results then transfer to $G(n, (d_i)_1^n)$ too, provided
Condition 2.3 holds. We first do the argument in detail in the simplest case,
viz. $G^*(n, (d_i)_1^n)$ without edge deletion (i.e. $p = 1$) or vaccination and
prove our main results concerning the existence, size and uniqueness of the
giant component. We use and adapt the method in Bollobás, Janson and Riordan [5]
(for a different random graph model). This will provide a new proof of the
results by Molloy and Reed [14, 15] (under our slightly weaker condition). We
will then describe the modifications needed to make the results valid also when
there is edge deletion or vaccination.

We say that an event holds with high probability (whp), if it holds with
probability tending to 1 as $n \to \infty$. We shall use $o_p$ in the standard way
(see e.g. Janson, Łuczak and Ruciński [11]); for example, if $(X_n)$ is a sequence
of random variables, then $X_n = o_p(1)$ means that $X_n \xrightarrow{p} 0$. We
shall often use the basic fact that, if $a \in \mathbb R$, then
$X_n \xrightarrow{p} a$ if and only if, for every $\varepsilon > 0$, the relations
$X_n > a - \varepsilon$ and $X_n < a + \varepsilon$ hold whp. All unspecified
limits are taken as $n \to \infty$, while $p$ and the vaccination parameters $v$
or $c$ are kept fixed.

We denote the orders of the components of a graph $G$ by
$C_1(G) \ge C_2(G) \ge \dots$, with $C_j(G) = 0$ if $G$ has fewer than $j$
components. We let $N_k(G)$ denote the total number of vertices in components of
order $k$, and write $N_{\ge k}(G)$ for $\sum_{j \ge k} N_j(G)$, the number of
vertices in components of order at least $k$. Similarly, we let $N_{k,d}(G)$ and
$N_{\ge k,d}(G)$ denote the number of such vertices that have degree $d$.

Remark 5.1. Our results are typically of the form
$C_1(G_n) = \tau n + o_p(n)$ and $C_2(G_n) = o_p(n)$ for some number
$\tau \ge 0$ (or, equivalently, $C_1(G_n)/n \xrightarrow{p} \tau$ and
$C_2(G_n)/n \xrightarrow{p} 0$). Hence, if $\tau > 0$, then there is exactly one
"giant" component, and all other components are much smaller. In our epidemic
setting, this means that if $\tau = 0$, then every epidemic will be "small", i.e.
$o(n)$, while if $\tau > 0$, then the epidemic is large with probability $\tau$
(allowing the case that the initially infected person is vaccinated and thus
never becomes ill), and in that case a fraction $\tau$ of the population will be
infected. ($\tau$ thus has a double role.)

5.1. $G^*(n, (d_i)_1^n)$, with $p = 1$ and no vaccination. As said above, we
will use a branching process approximation. The particles in the branching
process correspond to free (not yet paired) half-edges. Note that there are
$j n_j$ half-edges belonging to vertices of degree $j$. Hence, a random half-edge
shares a vertex with $j - 1$ other half-edges with probability
$j n_j / \sum_k k n_k$. By Condition 2.1, $j n_j / \sum_k k n_k \to j p_j / \mu$;
recall the definition $\tilde p_j = j p_j / \mu$ in (3.1). Let $\mathcal X$ be the
Galton-Watson branching process starting with one particle and with the
offspring distribution $(\tilde p_{j+1})_{j=0}^\infty$. (This is the distribution
$(p_j)_j$ size-biased and shifted one step.) In other words, the offspring
distribution is that of $\tilde D - 1$, with $\tilde D$ as in Section 3.

We let $\rho = \rho(\mathcal X)$ denote the survival probability of $\mathcal X$,
and define

$$\tau := \sum_{d=1}^\infty p_d\bigl(1 - (1 - \rho)^d\bigr); \tag{5.1}$$

this is the survival probability for the branching process $\mathcal X$ started
with a random number of particles having the distribution $(p_d)_{d=0}^\infty$.

Consider a vertex $x$ of degree $d$ in $G^*(n, (d_i)_1^n)$. We explore the
component containing $x$ by a breadth-first search. We concentrate on the
half-edges, so we begin by taking the $d$ half-edges at $x$, and label them as
active. We then process the active half-edges one by one as follows. We take an
active half-edge, relabel it as used, and find the half-edge that it connects
to and the corresponding vertex; this partner is chosen uniformly among all
half-edges that are not yet used. We then label the partner as used and all
other half-edges at the same vertex as active, provided that they are not
already used (which would mean that we have found a cycle or a multiple edge).
The active half-edges will behave essentially as a Galton-Watson process
(where we reveal the children of the particles one by one), but the probability
distribution of the children will vary slightly; it will depend on the numbers
of vertices of different degrees that we already have found. Nevertheless, it
is obvious that at each step in the beginning, the probability of $j - 1$ new
half-edges is close to $j n_j / \sum_k k n_k \approx \tilde p_j$.
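The exploration just described can be sketched in a few lines of code. The function below pairs half-edges on the fly, exactly one pair per step; the name `explore_component` and the data layout are choices made for this illustration, and the total number of half-edges is assumed to be even.

```python
import random
from collections import deque

def explore_component(degrees, start, rng):
    """Size of the component of `start`, pairing half-edges as they are explored."""
    # owner[h] is the vertex to which half-edge h is attached
    owner = [v for v, d in enumerate(degrees) for _ in range(d)]
    half_edges_at = [[] for _ in degrees]
    for h, v in enumerate(owner):
        half_edges_at[v].append(h)

    unused = list(range(len(owner)))     # half-edges not yet used
    slot = list(range(len(owner)))       # slot[h] = position of h in `unused`
    used = [False] * len(owner)

    def mark_used(h):
        # O(1) removal from `unused`: overwrite h with the last entry, then pop
        i, last = slot[h], unused[-1]
        unused[i] = last
        slot[last] = i
        unused.pop()
        used[h] = True

    found = {start}
    active = deque(half_edges_at[start])
    while active:
        h = active.popleft()
        if used[h]:                      # already chosen as a partner (cycle or multiple edge)
            continue
        mark_used(h)
        partner = unused[rng.randrange(len(unused))]   # uniform among unused half-edges
        mark_used(partner)
        w = owner[partner]
        if w not in found:
            found.add(w)
            active.extend(g for g in half_edges_at[w] if not used[g])
    return len(found)

rng = random.Random(1)
degrees = [1] * 500 + [3] * 500
print(explore_component(degrees, start=999, rng=rng))
```

Repeating the exploration from many starting vertices gives an empirical estimate of the distribution of $|C(x)|$, which for small sizes should be close to that of the total progeny of the branching process.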

To be more precise, first, let $k$ be a fixed number, and consider the event
that $x$ belongs to a component with at least $k$ vertices. This is almost the
same as the event that we will find at least $k - 1$ active half-edges
in the process just described. (This is not exact, because if we stop when
we have found $k - 1$ half-edges, some of these may connect back to vertices
already found; the probability of this tends to 0, however, as $n \to \infty$.)
The complementary event, that the process finds fewer than $k - 1$ active
half-edges, consists of a finite number of cases, where each case describes the
sequence of new active half-edges found at each step. It is obvious that the
probability of each of these cases converges, as $n \to \infty$, to the
corresponding probability in $\mathcal X^d$, and thus we find, for a vertex $x$ of
degree $d$, with $C(x)$ denoting the corresponding component of
$G^*(n, (d_i)_1^n)$,

$$P(|C(x)| \ge k) = P(|\mathcal X^d| \ge k - 1) + o(1). \tag{5.2}$$

Recall that $N_{\ge k,d}$ is the number of vertices of degree $d$ belonging to a
component of size at least $k$. The expectation $E N_{\ge k,d}$ equals $n_d$ times
the probability that a given vertex $x$ of degree $d$ satisfies $|C(x)| \ge k$,
and thus, by (5.2) and Condition 2.1(i), for every fixed $d \ge 0$ and $k \ge 1$,

$$E(N_{\ge k,d}/n) \to p_d\,P(|\mathcal X^d| \ge k - 1). \tag{5.3}$$

We next want to let $k \to \infty$ here. We thus, for the remainder of this
section, assume that $\omega(n)$ is a function such that $\omega(n) \to \infty$
but $\omega(n)/n \to 0$ as $n \to \infty$. We regard components as big if they
contain at least $\omega(n)$ vertices, and small otherwise. (The flexibility in
the choice of $\omega(n)$ is useful, but we will see that it does not matter
much; the asymptotics we find do not depend on $\omega$.)

Lemma 5.2. If $\omega(n) \to \infty$ and $\omega(n)/n \to 0$, then

$$E(N_{\ge\omega(n)}/n) \to \tau \tag{5.4}$$

and, for every fixed $d \ge 0$,

$$E(N_{\ge\omega(n),d}/n) \to p_d\,P(|\mathcal X^d| = \infty) = p_d\bigl(1 - (1 - \rho)^d\bigr). \tag{5.5}$$

Proof. We begin with an upper bound in (5.5). For any fixed $k$, we have
$\omega(n) \ge k$ for large $n$, and thus $N_{\ge\omega(n),d} \le N_{\ge k,d}$.
Consequently, (5.3) yields

$$\limsup_{n\to\infty} E(N_{\ge\omega(n),d}/n) \le \limsup_{n\to\infty} E(N_{\ge k,d}/n) = p_d\,P(|\mathcal X^d| \ge k - 1). \tag{5.6}$$

As $k \to \infty$, the right hand side converges to
$p_d\,P(|\mathcal X^d| = \infty)$, and we find

$$\limsup_{n\to\infty} E(N_{\ge\omega(n),d}/n) \le p_d\,P(|\mathcal X^d| = \infty). \tag{5.7}$$

For a lower bound, let $\nu \ge 1$ be fixed and let $X_\nu$ be a random variable
taking values in $\{0, 1, \dots, \nu\}$ with
$P(X_\nu = j) = (1 - \nu^{-1})\tilde p_{j+1}$ for $1 \le j \le \nu$ (and a suitable
value for $P(X_\nu = 0)$ so that the sum becomes 1). Consider the breadth-first
exploration process described above. As long as we have found fewer than
$\omega(n)$ vertices, the number of new active half-edges at each step
stochastically dominates $X_\nu$, provided $n$ is large enough, since the
remaining number of vertices of degree $j + 1$ is
$n_{j+1} - o(n) = p_{j+1}n - o(n) \ge (1 - \nu^{-1})p_{j+1}n$ for $n$ large. (If
$p_{j+1} = 0$, the result is trivial.) Consequently, letting $\mathcal X_\nu^d$ be
the Galton-Watson process with $d$ initial particles and the number of children
distributed as $X_\nu$, if $n$ is large enough, we can couple the exploration
process and $\mathcal X_\nu^d$ such that as long as we have found fewer than
$\omega(n)$ vertices, the number of active half-edges is at least the number of
active particles in $\mathcal X_\nu^d$ (i.e., the particles whose children have
not yet been revealed). In particular, if the exploration process stops before
$\omega(n)$ vertices are found, then $\mathcal X_\nu^d$ stops, and thus the
probability that a vertex $x$ of degree $d$ satisfies $|C(x)| < \omega(n)$ is at
most $P(|\mathcal X_\nu^d| < \infty)$. Consequently, for large $n$,

$$E N_{\ge\omega(n),d} \ge n_d\,P(|\mathcal X_\nu^d| = \infty) \tag{5.8}$$

and thus

$$\liminf_{n\to\infty} E(N_{\ge\omega(n),d}/n) \ge p_d\,P(|\mathcal X_\nu^d| = \infty). \tag{5.9}$$

Now let $\nu \to \infty$. Then $X_\nu \xrightarrow{d} X$, where $X$ has the
distribution $P(X = j) = \tilde p_{j+1}$, and thus, by Lemma 4.1,
$P(|\mathcal X_\nu^d| = \infty) \to P(|\mathcal X^d| = \infty)$. Consequently,

$$\liminf_{n\to\infty} E(N_{\ge\omega(n),d}/n) \ge p_d\,P(|\mathcal X^d| = \infty),$$

which together with (5.7) and (4.1) yields (5.5).

Finally, noting that $N_{\ge\omega(n),d} \le n_d$, it follows easily from the
uniform summability in (2.1) that we can sum (5.5) over $d$ and take the limit
outside the sum, i.e.

$$E(N_{\ge\omega(n)}/n) = \sum_d E(N_{\ge\omega(n),d}/n) \to \sum_d p_d\bigl(1 - (1 - \rho)^d\bigr) = \tau. \qquad □$$

Note that the limits do not depend on the choice of $\omega(n)$. Hence, it
follows that the expected number of vertices belonging to components of
size between, say, $\log n$ and $n^{0.99}$ is $o(n)$.

We next show that we have convergence not only of the expectations
but also of the random variables in (5.4) and (5.5), i.e. that these random
variables are concentrated close to their expectations.

Lemma 5.3. If $\omega(n) \to \infty$ and $\omega(n)/n \to 0$, then

$$N_{\ge\omega(n)}/n \xrightarrow{p} \tau \tag{5.10}$$

and, for every fixed $d \ge 0$,

$$N_{\ge\omega(n),d}/n \xrightarrow{p} p_d\bigl(1 - (1 - \rho)^d\bigr). \tag{5.11}$$
Proof. Start with two distinct vertices $x$ and $y$ of the same degree $d$ and
explore their components as above. We can repeat the arguments above,
and find

$$P(|C(x)| < k,\ |C(y)| < k) = P(|\mathcal X^d| < k - 1)^2 + o(1)$$

and thus, using (5.2),

$$P(|C(x)| \ge k,\ |C(y)| \ge k) = P(|\mathcal X^d| \ge k - 1)^2 + o(1).$$

Multiplying with the number $n_d(n_d - 1)$ of pairs $(x, y)$ of the same degree
$d$, and noting that the number of such pairs where both $x$ and $y$ belong to
components of size $\ge k$ (the same or not) is
$N_{\ge k,d}(N_{\ge k,d} - 1)$, we find

$$E(N_{\ge k,d}^2/n^2) = E\bigl(N_{\ge k,d}(N_{\ge k,d} - 1)/n^2\bigr) + O(1/n) \to p_d^2\,P(|\mathcal X^d| \ge k - 1)^2.$$

Hence, $\limsup_{n\to\infty} E(N_{\ge\omega(n),d}^2/n^2) \le p_d^2\,P(|\mathcal X^d| \ge k - 1)^2$
for every $k$, and thus

$$\limsup_{n\to\infty} E(N_{\ge\omega(n),d}^2/n^2) \le p_d^2\,P(|\mathcal X^d| = \infty)^2.$$

Since, by the Cauchy-Schwarz inequality and (5.5), further

$$E(N_{\ge\omega(n),d}^2/n^2) \ge \bigl(E(N_{\ge\omega(n),d}/n)\bigr)^2 \to p_d^2\,P(|\mathcal X^d| = \infty)^2,$$

it follows that

$$E(N_{\ge\omega(n),d}^2/n^2) \to p_d^2\,P(|\mathcal X^d| = \infty)^2.$$

This and (5.5) show that

$$\operatorname{Var}(N_{\ge\omega(n),d}/n) \to 0,$$

and thus

$$\bigl(N_{\ge\omega(n),d} - E(N_{\ge\omega(n),d})\bigr)/n \xrightarrow{p} 0,$$

which by (5.5) implies (5.11).

Finally, again we can sum over $d$ because of (2.1); this yields (5.10). □

Theorem 5.4. Assume that Condition 2.1 holds. Then

$$C_1(G^*(n, (d_i)_1^n)) = \tau n + o_p(n),$$

$$C_2(G^*(n, (d_i)_1^n)) = o_p(n).$$

Proof. We have already shown that roughly $\tau n$ vertices lie in big
components. It remains to show that most of them belong to the same component.
We write $G_n = G^*(n, (d_i)_1^n)$.

First, if $C_1(G_n) \ge \omega(n)$, then $N_{\ge\omega(n)}(G_n) \ge C_1(G_n)$.
Thus, for every $\varepsilon > 0$ and $n$ so large that
$\omega(n) < \varepsilon n$, we have by Lemma 5.3

$$P(C_1(G_n) > \tau n + \varepsilon n) \le P(N_{\ge\omega(n)}(G_n) > \tau n + \varepsilon n) \to 0. \tag{5.12}$$

This completes the proof if $\tau = 0$.

In the sequel we assume $\tau > 0$ and show a corresponding estimate from
below. First, if $p_d = 0$ for every $d \ge 2$, then $\tilde p_{j+1} = 0$ for all
$j \ge 1$, so $\mathcal X$ dies immediately and $\rho = 0$ and $\tau = 0$. Hence
$p_d > 0$ for some $d \ge 2$. We fix such a $d$ for the remainder of the proof,
and fix $\delta$ with $0 < \delta < 1/2$. Further, take (rather arbitrarily)
$\omega(n) = n^{0.9}$.

We assume in the sequel that $n$ is so large that $n_d \ge n^{1-\delta}$. We then
split the $n^{1-\delta}$ first of the vertices of degree $d$ in $G_n$ into $d$
vertices of degree 1 each; we colour these $d n^{1-\delta}$ new vertices red. (To
be precise, we should round $n^{1-\delta}$ to an integer.) We denote the
resulting graph by $G'_n$; note that $G'_n$ is a random multigraph
$G^*(n', (d'_i)_1^{n'})$ where $n'_j$, the number of vertices of degree $j$, is
given by $n'_d = n_d - n^{1-\delta}$, $n'_1 = n_1 + d n^{1-\delta}$, and
$n'_j = n_j$ for $j \ne 1, d$. Note that the total number of vertices in $G'_n$
is $n' := n + (d - 1)n^{1-\delta} = n + o(n)$, and that $(d'_i)$ satisfies
Condition 2.1 with the same $(p_j)$ (except that $n$ is replaced by $n'$, which
only makes a notational difference). Consequently, our results above apply to
$G'_n$ too.

By symmetry, we may assume that the $d n^{1-\delta}$ red vertices in $G'_n$ are
chosen at random among all vertices of degree 1, and that $G_n$ is obtained
by partitioning the red vertices at random into groups with $d$ vertices and
then coalescing each group into one vertex.

During the exploration of the component $C'(x)$ in $G'_n$ containing a
vertex $x$, in each step, the active half-edge is paired with the single
half-edge leading to a red vertex with probability at least $c_1 n^{-\delta}$,
for some $c_1 > 0$, unless at least $n^{1-\delta}$ red vertices already have been
found. Consequently, if the component $C'(x)$ has at least $\omega(n)$ vertices,
the number of red vertices stochastically dominates
$\min\bigl(n^{1-\delta}, \mathrm{Bi}(\omega(n) - 1, c_1 n^{-\delta})\bigr)$. A
Chernoff bound, see e.g. [11, Corollary 2.3], shows that the probability that
$C'(x)$ has at least $\omega(n)$ vertices but fewer than
$c_2 n^{-\delta}\omega(n) = c_2 n^{0.9-\delta}$ red vertices is at most
$\exp(-c_3 n^{0.9-\delta}) = o(n^{-1})$, for $c_2 = c_1/2$ and some $c_3 > 0$.
Summing over all $x$, we see that whp, every big component of $G'_n$ contains at
least $c_2 n^{0.9-\delta}$ red vertices.

Assume that this holds, and consider two big components $K_1$ and $K_2$ in
$G'_n$. We can construct the random partition of the red vertices by taking
first the red vertices in $K_1$ one by one, unless already used, and randomly
selecting $d - 1$ partners. We thus do this at least
$m := c_2 n^{0.9-\delta}/d$ times, and each time the probability of not including
a red vertex in $K_2$ is at most
$1 - c_2 n^{0.9-\delta}/(d n^{1-\delta}) = 1 - c_4 n^{-0.1}$, with $c_4 = c_2/d$.
Consequently, the probability of not joining $K_1$ and $K_2$ in the coalescing
phase is at most

$$\exp(-m c_4 n^{-0.1}) = \exp(-c_4^2 n^{0.8-\delta}) = o(n^{-2}).$$

Since there are at most $(n')^2 = O(n^2)$ such pairs $K_1$ and $K_2$, we see that
whp all big components in $G'_n$ are connected in $G_n$. Hence, if $B'$ is the
union of all big components in $G'_n$, and $B$ is the corresponding set of
vertices in $G_n$, we see that whp $B$ is connected in $G_n$, and, using
Lemma 5.3 for $G'_n$,

$$C_1(G_n) \ge |B| \ge |B'| - (d - 1)n^{1-\delta} = \tau n' + o_p(n') = \tau n + o_p(n). \tag{5.13}$$

Combining (5.13) and (5.12) we obtain $C_1(G_n) = \tau n + o_p(n)$.

Finally, we observe that if $C_2(G_n) \ge \omega(n)$, then
$N_{\ge\omega(n)}(G_n) \ge C_1(G_n) + C_2(G_n)$, and thus, by (5.10) and (5.13),

$$C_2(G_n) \le \max\bigl(\omega(n),\ N_{\ge\omega(n)}(G_n) - C_1(G_n)\bigr) = o_p(n). \qquad □$$
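Theorem 5.4 can be checked empirically on a small case. The sketch below builds a random multigraph by a uniform pairing of half-edges, finds the largest component with a union-find structure, and compares its relative size with $\tau$ from (5.1). The degree sequence (half of the vertices of degree 1, half of degree 3), the sample size and the function names are assumptions made for this illustration.

```python
import random

def largest_component_fraction(degrees, rng):
    """Relative size of the largest component of a random multigraph with given degrees."""
    # One entry per half-edge; shuffling and pairing neighbours gives a uniform matching.
    owner = [v for v, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(owner)
    parent = list(range(len(degrees)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for i in range(0, len(owner) - 1, 2):
        a, b = find(owner[i]), find(owner[i + 1])
        if a != b:
            parent[a] = b

    sizes = {}
    for v in range(len(degrees)):
        r = find(v)
        sizes[r] = sizes.get(r, 0) + 1
    return max(sizes.values()) / len(degrees)

def giant_fraction_theory(p, iters=1000):
    """tau from (5.1) for a degree distribution given as {degree: probability}."""
    mu = sum(d * pd for d, pd in p.items())
    q = 0.0                                  # extinction probability of the process X
    for _ in range(iters):
        q = sum(d * pd / mu * q ** (d - 1) for d, pd in p.items() if d >= 1)
    return sum(pd * (1 - q ** d) for d, pd in p.items())   # rho = 1 - q

p = {1: 0.5, 3: 0.5}
degrees = [1] * 10000 + [3] * 10000
print("theory    :", giant_fraction_theory(p))
print("simulation:", largest_component_fraction(degrees, random.Random(1)))
```

For this degree distribution the offspring generating function is $1/4 + 3t^2/4$, so $\rho = 2/3$ and $\tau = 22/27 \approx 0.815$; the simulated fraction should be close to this value for large $n$.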

5.2. The simple random graph $G(n, (d_i)_1^n)$. We transfer the results to
the simple random graph $G(n, (d_i)_1^n)$ by the following result proved in [9];
see also e.g. Bollobás [4] and McKay [13] for earlier versions.

Lemma 5.5. If Conditions 2.1 and 2.3 hold, then

$$\liminf_{n\to\infty} P\bigl(G^*(n, (d_i)_1^n) \text{ is a simple graph}\bigr) > 0.$$

All results for $G^*(n, (d_i)_1^n)$ that can be stated in terms of convergence in
probability, as our results in this section, thus hold also if we condition on
the graph being simple. In other words, the results proved for
$G^*(n, (d_i)_1^n)$ hold for $G(n, (d_i)_1^n)$ too. Thus, Theorem 5.4 has the
following version for $G(n, (d_i)_1^n)$.

Theorem 5.6. Assume that Conditions 2.1 and 2.3 hold. Then

$$C_1(G(n, (d_i)_1^n)) = \tau n + o_p(n),$$

$$C_2(G(n, (d_i)_1^n)) = o_p(n).$$

5.3. Uniform vaccination. We now extend Theorem 5.4 to the graph
$G^*(n,(d_i)_1^n)^{\mathsf U}_{v;p}$ where $0 \le v < 1$ and $0 < p \le 1$, see Section 2. Recall that we
obtain this graph from $G^*(n,(d_i)_1^n)$ by randomly and independently deleting
edges with probability $1-p$ (non-transmission) and vertices with probability
$v$ (vaccination). The branching process approximation arguments above still
work, with the difference that each new individual found is kept with
probability $p(1-v)$, and otherwise discarded. Hence the offspring distribution
is changed from $\tilde D - 1$ to $X_v \sim \mathrm{MixBi}(\tilde D - 1, p(1-v))$, and the
branching process corresponding to an unvaccinated person with $d$ friends starts
with $\mathrm{Bi}(d, p(1-v))$ individuals. Let now $\mathcal X_d$ denote the branching process
with this offspring distribution, starting with $d$ individuals. The probability
generating function of $X_v$ is, as shown in Subsections 3.1 and 3.2, given by

$$f_{X_v}(t) = \frac{f_D'\bigl(1 - p(1-v)(1-t)\bigr)}{f_D'(1)}.$$

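Hence, the extinction probability of $\mathcal X_1$ is $\tilde\pi_v$ given by (3.6). If we start
the branching process with $D' \sim \mathrm{Bi}(d, p(1-v))$ individuals, the extinction
probability is thus, writing $\bar p = p(1-v)$,

$$\pi^{(d)} := \sum_{k=0}^{d} \binom{d}{k} \bar p^{\,k} (1-\bar p)^{d-k} \tilde\pi_v^{\,k} = (1 - \bar p + \bar p\,\tilde\pi_v)^d.$$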

The arguments in the proofs of Lemmas 5.2 and 5.3 show, recalling that
each vertex has probability $1 - v$ of being unvaccinated, that (5.11) holds in
the form

$$N_{\ge\omega(n),d}/n \to_p p_d (1-v)(1-\pi^{(d)}),$$

for every fixed $d \ge 0$, assuming $\omega(n) \to \infty$ and $\omega(n)/n \to 0$. Hence,

$$N_{\ge\omega(n)}/(n(1-v)) \to_p \sum_d p_d (1-\pi^{(d)}) = 1 - \sum_d p_d \pi^{(d)} = 1 - f_D(1 - \bar p + \bar p\,\tilde\pi_v).$$

This limit equals $\tau^{\mathsf U}_{v;p} := 1 - \pi^{\mathsf U}_{v;p}$ with $\pi^{\mathsf U}_{v;p}$ given by (3.7).
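
The limit above can be evaluated numerically once the degree distribution is known. The following is a minimal sketch, not part of the original argument: it assumes a degree distribution with finite support given as a list `pmf` (a hypothetical input format), finds the smallest fixed point of the offspring generating function by iterating from 0, and returns $1 - f_D(1 - \bar p + \bar p\,\tilde\pi_v)$. The function name and the stopping rule are illustrative choices.

```python
def uniform_vaccination_limit(pmf, p, v, tol=1e-12, max_iter=100_000):
    """Limit of N/(n(1-v)) under uniform vaccination (illustrative sketch).

    pmf[d] = P(D = d) for d = 0..len(pmf)-1; finite support and E[D] > 0
    are assumptions of this sketch.
    """
    rho = p * (1.0 - v)  # probability that a newly found individual is kept
    mean = sum(d * q for d, q in enumerate(pmf))  # f_D'(1) = E[D]

    def f(t):  # probability generating function of D
        return sum(q * t**d for d, q in enumerate(pmf))

    def f_prime(t):  # derivative of the generating function of D
        return sum(d * q * t ** (d - 1) for d, q in enumerate(pmf) if d >= 1)

    # Iterating the offspring generating function from 0 increases
    # monotonically to its smallest fixed point in [0, 1], which is the
    # extinction probability of the process started from one individual.
    t = 0.0
    for _ in range(max_iter):
        t_next = f_prime(1.0 - rho * (1.0 - t)) / mean
        converged = abs(t_next - t) < tol
        t = t_next
        if converged:
            break
    return 1.0 - f(1.0 - rho + rho * t)
```

For example, when every vertex has degree 3 and $p(1-v) = 3/4$, the fixed point equation $t = (1/4 + 3t/4)^2$ has smallest root $1/9$, and the sketch returns $1 - (1/3)^3 = 26/27$.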

To extend Theorem 5.4 it remains to show that there is only one very
large component. More precisely, we show again that, with $\omega(n) = n^{0.9}$,
there is whp only one big component. We argue as for Theorem 5.4, splitting
some vertices of degree $d$ in $G_n = G^*(n,(d_i)_1^n)$ into $d$ red vertices of degree
1, calling the resulting graph $G'_n$.

We vaccinate the vertices in $G'_n$ with probability $v$ each, independently;
we then recombine the red vertices to vertices of degree $d$ in $G_n$ and consider
each such vertex as vaccinated if at least one of its red parts in $G'_n$ is. This
means that some vertices in $G_n$ are vaccinated with probability larger than
$v$, but this does not hurt since the aim of the argument is to provide a lower
bound for $C_1$, the size of the largest component, and any extra vaccinations
can only decrease $C_1$.

By a Chernoff bound, there are whp at least $(1-v)n^{1-\delta}$ unvaccinated red
vertices, and it follows as before that whp every big component of $(G'_n)^{\mathsf U}_{v;p}$
contains at least $c_2 n^{0.9-\delta}$ red vertices (although the value of $c_2$ may change).
Given two big components $K_1$ and $K_2$ it follows similarly as before that with
probability $1 - o(n^{-2})$ there exists a vertex in $G_n$ that is split into $d$ red
vertices, of which at least one is in $K_1$, at least one in $K_2$, and all are
unvaccinated. The proof is completed as before.

Consequently, using also Lemma 5.5, we have the following theorem.
Theorems 3.2 and 3.1 (the special case $v = 0$) are immediate consequences.

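Theorem 5.7. Assume Condition 2.1 and let $0 < p \le 1$, $0 \le v < 1$. Then,

$$C_1\bigl(G^*(n,(d_i)_1^n)^{\mathsf U}_{v;p}\bigr) = \tau^{\mathsf U}_{v;p}\, n(1-v) + o_p(n),$$

$$C_2\bigl(G^*(n,(d_i)_1^n)^{\mathsf U}_{v;p}\bigr) = o_p(n),$$

where $\tau^{\mathsf U}_{v;p} = 1 - \pi^{\mathsf U}_{v;p}$ with $\pi^{\mathsf U}_{v;p}$ given by (3.7). If also Condition 2.3 holds,
then the same results hold for $G(n,(d_i)_1^n)^{\mathsf U}_{v;p}$ too.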
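
The statement of Theorem 5.7 can be checked by simulation for moderate $n$. The sketch below is only an illustration under stated assumptions: it builds the multigraph by a uniform random pairing of half-edges (the configuration model), removes vertices independently with probability $v$ and edges independently with probability $1-p$, and finds the two largest components with a union-find structure. The function name, the use of Python's `random` module and the union-find bookkeeping are choices made here, not taken from the paper; the degree sum is assumed to be even.

```python
import random
from collections import Counter


def two_largest_components(degrees, p, v, rng):
    """Sizes of the two largest components after uniform vaccination.

    degrees: list of vertex degrees with even sum (assumed).
    p: transmission probability; v: vaccination probability.
    rng: a random.Random instance, for reproducibility.
    """
    n = len(degrees)
    # Vertex i owns degrees[i] half-edges; shuffling and pairing consecutive
    # entries gives a uniform random pairing (loops and multiple edges allowed).
    half_edges = [i for i, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(half_edges)
    unvaccinated = [rng.random() >= v for _ in range(n)]

    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for a, b in zip(half_edges[0::2], half_edges[1::2]):
        # Keep an edge only if both endpoints are unvaccinated and
        # transmission takes place along it.
        if unvaccinated[a] and unvaccinated[b] and rng.random() < p:
            parent[find(a)] = find(b)

    sizes = Counter(find(i) for i in range(n) if unvaccinated[i])
    top = sorted(sizes.values(), reverse=True)[:2] + [0, 0]
    return top[0], top[1]
```

With all degrees equal to 3 and $p(1-v) = 3/4$, the theorem predicts $C_1/n$ close to $(1-v) \cdot 26/27$ and $C_2/n$ close to 0, which is what a run such as `two_largest_components([3] * 20000, 0.75, 0.0, random.Random(1))` should show up to random fluctuations.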

5.4. Acquaintance vaccination. As explained in Subsection 3.3, in order
to obtain (asymptotically) a Galton-Watson branching process, with the
right independence properties, we consider directed edges, or equivalently
half-edges, that are open, i.e. transmission may take place but the edge is
not used for vaccination. Moreover, we consider only open edges originating
at an unvaccinated person.

Let $x$ be a given vertex with degree $d$ in $G^*(n,(d_i)_1^n)$, and let us explore
the component of $x$ in $G^*(n,(d_i)_1^n)^{\mathsf A}_{c;p}$, conditioned on $x$ being unvaccinated
(otherwise $x$ does not belong to $G^*(n,(d_i)_1^n)^{\mathsf A}_{c;p}$). In order to be kept in
$G^*(n,(d_i)_1^n)^{\mathsf A}_{c;p}$, an edge has to be open, but not all open edges are kept since
some may lead to vertices that are vaccinated, see Figure [it). Nevertheless,
we consider all open edges found during the exploration. We declare the
open edges starting at $x$ to be active. We then investigate the active edges.
If an active edge leads to a person that is unvaccinated, we declare the open
edges going from that person, except the one going back to where we just
came from, to be new active edges. We continue until no more active edges
are found; we then have found the component containing $x$ (plus some extra
open edges leading to vaccinated persons).

We investigate this process probabilistically, revealing the structure of
$G^*(n,(d_i)_1^n)$ by combining half-edges at random as we proceed with the
exploration. We consider asymptotics as $n \to \infty$, and some of the statements
below are only approximately correct for finite $n$.

Note first that each of the $d$ edges leading from $x$ is open with probability
$pe^{-c/d}$, independently of each other, so we start with $\mathrm{Bi}(d, pe^{-c/d})$ open
edges.

The vertex $x$ has $d$ friends; in $G^*(n,(d_i)_1^n)$ they are chosen by randomly
choosing $d$ half-edges and their degrees have the size-biased distribution $(\tilde p_j)$,
independently of each other. Conditioning on $x$ being unvaccinated means
that we condition on none of the $d$ edges being used for vaccination in the
opposite direction. Since the probability that a friend with degree $j$ does
not name $x$ is $e^{-c/j}$, this preserves the independence of the degrees of the
friends, but shifts their distribution to, as asserted in (3.11), $(\tilde p_j e^{-c/j}/\alpha)_j$,
where $\alpha = \alpha(c) = \sum_j \tilde p_j e^{-c/j}$ as in (3.8) is the probability of not being
named by a random friend.

Now suppose that an open edge goes from $x$ to a friend $y$ of degree $k$. In
order for this to define an edge in $G^*(n,(d_i)_1^n)^{\mathsf A}_{c;p}$, $y$ must not be vaccinated
through another of its friends; this has the probability $\alpha^{k-1}$. In this case, $y$
has $k - 1$ further edges, and each of them is open with probability $pe^{-c/k}$.
It follows that the number of new open edges at $y$ has a distribution that
is the mixture $(1 - \alpha^{k-1})\delta_0 + \alpha^{k-1}\,\mathrm{Bi}(k-1, pe^{-c/k})$. Using the distribution
(3.11) for the degree of $y$, we finally see that the distribution of the number
$Y$ of new active edges found when exploring a single active edge is given by
(3.12).
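
The construction of the distribution of $Y$ described above can be written out directly. The sketch below is an illustration only, assuming a degree distribution with finite support and positive mean supplied as a list `pmf` (a hypothetical input format). It computes $\alpha$, reweights the size-biased degrees by $e^{-c/k}/\alpha$, and then mixes the point mass at 0 with the binomial distribution exactly as in the paragraph above. The function name is a choice made here.

```python
import math


def acquaintance_offspring_pmf(pmf, p, c):
    """Return (alpha, y) where y[j] = P(Y = j) (illustrative sketch).

    pmf[d] = P(D = d); finite support and E[D] > 0 are assumed.
    """
    mu = sum(d * q for d, q in enumerate(pmf))
    size_biased = [d * q / mu for d, q in enumerate(pmf)]
    # alpha: probability of not being named by a random friend
    alpha = sum(q * math.exp(-c / j) for j, q in enumerate(size_biased) if j >= 1)

    y = [0.0] * max(len(pmf) - 1, 1)  # Y takes values in 0..(max degree - 1)
    for k, q in enumerate(size_biased):
        if k == 0 or q == 0.0:
            continue
        weight = q * math.exp(-c / k) / alpha  # shifted degree distribution of y
        not_vaccinated = alpha ** (k - 1)      # y not named by its other friends
        open_prob = p * math.exp(-c / k)       # each further edge is open
        y[0] += weight * (1.0 - not_vaccinated)
        for j in range(k):
            binom = math.comb(k - 1, j) * open_prob**j * (1.0 - open_prob) ** (k - 1 - j)
            y[j] += weight * not_vaccinated * binom
    return alpha, y
```

As a sanity check, for $c = 0$ nobody is vaccinated, so $\alpha = 1$ and $Y$ reduces to a mixed binomial; with all degrees equal to 3 and $p = 1/2$ the sketch gives the $\mathrm{Bi}(2, 1/2)$ probabilities.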

Hence, observing obvious independence properties, the process of active
edges is (asymptotically) a Galton-Watson branching process with offspring
distribution $Y$, starting with $\mathrm{Bi}(d, pe^{-c/d})$ active edges. Denote this
branching process by $\mathcal X^{(d)}$. Let, as in Subsection 3.3, $\tilde\pi_{c;p}$ be the probability that
a branching process with this offspring distribution $Y$ and starting with a
single individual dies out. Then, the extinction probability of $\mathcal X^{(d)}$ is

$$\pi^{(d)} := P(|\mathcal X^{(d)}| < \infty) = \sum_{j=0}^{d} \binom{d}{j} (pe^{-c/d})^j (1 - pe^{-c/d})^{d-j} (\tilde\pi_{c;p})^j.$$

A minor complication is that the branching process approximation counts
open edges and, as remarked above, not all open edges lead to vertices in
$G^*(n,(d_i)_1^n)^{\mathsf A}_{c;p}$. Thus (5.2) does not extend directly. However, we still have
the inequality

$$P(|C(x)| \ge k) \le P(|\mathcal X^{(d)}| \ge k-1) + o(1).$$

Furthermore, a vertex of degree $d$ in $G^*(n,(d_i)_1^n)$ is unvaccinated with
probability $\alpha^d$, and thus

$$E(N_{\ge k,d}) \le n_d \alpha^d \bigl(P(|\mathcal X^{(d)}| \ge k-1) + o(1)\bigr),$$

which arguing as in (5.6) and (5.7) leads to

$$\limsup_{n\to\infty} E(N_{\ge\omega(n),d}/n) \le p_d \alpha^d P(|\mathcal X^{(d)}| = \infty) = p_d \alpha^d (1 - \pi^{(d)}). \qquad (5.14)$$
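
The quantity $\pi^{(d)}$ entering the bound (5.14) is straightforward to compute once an offspring distribution is given. The following sketch is illustrative and makes its own assumptions: the offspring distribution is a finite list `offspring` (hypothetical input), the single-ancestor extinction probability is obtained by iterating the generating function from 0, and $\pi^{(d)}$ is evaluated through the binomial sum for a process started with $\mathrm{Bi}(d, pe^{-c/d})$ individuals. Function names are choices made here.

```python
import math


def extinction_probability(offspring, tol=1e-12, max_iter=100_000):
    """Smallest fixed point in [0, 1] of the offspring generating function.

    offspring[j] = P(Y = j); finite support is an assumption of this sketch.
    """
    s = 0.0
    for _ in range(max_iter):
        s_next = sum(q * s**j for j, q in enumerate(offspring))
        if abs(s_next - s) < tol:
            return s_next
        s = s_next
    return s


def extinction_from_vertex(d, p, c, single):
    """pi^(d): extinction probability when starting with Bi(d, p*exp(-c/d))
    individuals, each of whose lines dies out with probability `single`."""
    if d == 0:
        return 1.0  # no edges to explore, the process is empty
    q = p * math.exp(-c / d)
    return sum(
        math.comb(d, j) * q**j * (1.0 - q) ** (d - j) * single**j
        for j in range(d + 1)
    )
```

By the binomial theorem the sum equals $(1 - q + q\tilde\pi)^d$ with $q = pe^{-c/d}$, which gives a quick consistency check on the output.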

For a lower bound, we note that an open edge creates new open edges in
the exploration process only if it leads to an unvaccinated person. Hence,
if $f(\mathcal X^{(d)})$ denotes the number of individuals in the branching process
with at least one child, we have, for every $k \ge 1$,

$$P(|C(x)| \ge k) \ge P(f(\mathcal X^{(d)}) \ge k-1) + o(1).$$

In order to replace the fixed $k$ by $\omega(n)$, we do as in the proof of Lemma 5.2
and define a Galton-Watson process $\mathcal X^{(d)}_\nu$, now starting with
$\mathrm{Bi}(d, pe^{-c/d}(1 - \nu^{-1}))$ individuals and with an offspring distribution $Y_\nu$ on $\{0, \dots, \nu\}$ with
$P(Y_\nu = j) = (1 - \nu^{-1}) P(Y = j)$ for $j = 1, \dots, \nu$.

For each $\nu$ and each fixed $A < \infty$, we can for large $n$ couple the exploration
process and $\mathcal X^{(d)}_\nu$ as in the proof of Lemma 5.2 as long as we have found at
most $A\omega(n)$ open edges. Hence, if $|C(x)| < \omega(n)$, then either $f(\mathcal X^{(d)}_\nu) < \omega(n)$
or the process $\mathcal X^{(d)}_\nu$ reaches more than $A\omega(n)$ individuals while fewer than
$\omega(n)$ of them, plus the root, have had children. The probability of the latter
event is at most, since the root has at most $d$ children,

$$P\Bigl(1 + d + \sum_{i=1}^{\omega(n)} Y'_i > A\omega(n)\Bigr),$$

where $Y'_i$ are independent random variables with the distribution $\mathcal L(Y \mid Y > 0)$,
and thus this probability tends to 0 by the law of large numbers
provided we have chosen $A > E(Y \mid Y > 0)$.
Consequently,

$$P(|C(x)| < \omega(n)) \le P(f(\mathcal X^{(d)}_\nu) < \omega(n)) + o(1) \le P(f(\mathcal X^{(d)}_\nu) < \infty) + o(1).$$

Using again that a person with degree $d$ is unvaccinated with probability
$\alpha^d$, it follows that

$$E N_{\ge\omega(n),d} \ge n_d \alpha^d \bigl(1 - P(f(\mathcal X^{(d)}_\nu) < \infty) + o(1)\bigr)$$

and thus

$$\liminf_{n\to\infty} E(N_{\ge\omega(n),d}/n) \ge p_d \alpha^d P(|\mathcal X^{(d)}_\nu| = \infty).$$

We let $\nu \to \infty$ and obtain by Lemma 4.1

$$\liminf_{n\to\infty} E(N_{\ge\omega(n),d}/n) \ge p_d \alpha^d P(|\mathcal X^{(d)}| = \infty) = p_d \alpha^d (1 - \pi^{(d)}),$$

which together with (5.14) yields

$$E(N_{\ge\omega(n),d}/n) \to p_d \alpha^d (1 - \pi^{(d)}).$$
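Arguing as in the proof of Lemma 5.3 we find also

$$N_{\ge\omega(n),d}/n \to_p p_d \alpha^d (1 - \pi^{(d)})$$

and, recalling (3.15) and (3.9),

$$N_{\ge\omega(n)}/n \to_p \sum_d p_d \alpha^d (1 - \pi^{(d)}) = \sum_d p_d \alpha^d (1 - \pi^{\mathsf A}_{c;p}) = (1 - v(c))\,\tau^{\mathsf A}_{c;p},$$

with $\tau^{\mathsf A}_{c;p} = 1 - \pi^{\mathsf A}_{c;p}$. In particular,

$$C_1\bigl(G^*(n,(d_i)_1^n)^{\mathsf A}_{c;p}\bigr) \le \omega(n) + N_{\ge\omega(n)} \le (1 - v(c))\,\tau^{\mathsf A}_{c;p}\, n + o_p(n).$$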

Finally, we argue again as in the proof of Theorem 5.4 to show that most
vertices in large components belong to a single component. We split some
of the vertices in $G_n = G^*(n,(d_i)_1^n)$ as above and perform acquaintance
vaccination on the resulting graph $G'_n$. This corresponds to acquaintance
vaccination on $G_n$, except that the vertices that are split now are asked to
name a friend $\mathrm{Po}(dc)$ times instead of $\mathrm{Po}(c)$. We thus perform some extra
vaccinations, but this can only decrease $C_1$ and we obtain as in (5.13) the
lower bound

$$C_1\bigl(G^*(n,(d_i)_1^n)^{\mathsf A}_{c;p}\bigr) \ge (1 - v(c))\,\tau^{\mathsf A}_{c;p}\, n + o_p(n).$$

Summing up, and using Lemma 5.5, we have shown the following theorem.
Theorem 3.3 is an immediate consequence.

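Theorem 5.8. Assume Condition 2.1 and let $0 < p \le 1$, $0 \le c < \infty$.
Then,

$$C_1\bigl(G^*(n,(d_i)_1^n)^{\mathsf A}_{c;p}\bigr) = \tau^{\mathsf A}_{c;p}\, n(1 - v(c)) + o_p(n),$$

$$C_2\bigl(G^*(n,(d_i)_1^n)^{\mathsf A}_{c;p}\bigr) = o_p(n),$$

where $\tau^{\mathsf A}_{c;p} = 1 - \pi^{\mathsf A}_{c;p}$ with $\pi^{\mathsf A}_{c;p}$ given by (3.15). If also Condition 2.3 holds,
then the same results hold for $G(n,(d_i)_1^n)^{\mathsf A}_{c;p}$ too.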

5.5. Edgewise vaccination. We argue as for acquaintance vaccination
with the modifications (simplifications) explained in Subsection 3.4. There
are no new complications, and we obtain the following. Theorem 3.4 is an
immediate consequence.

Theorem 5.9. Assume Condition 2.1 and let $0 < p \le 1$, $0 \le \alpha < 1$. Then,
for $j = 1, 2$,

$$C_1\bigl(G^*(n,(d_i)_1^n)^{\mathsf E}_{\alpha;p}\bigr) = \tau^{\mathsf E}_{\alpha;p}\, n(1 - v(\alpha)) + o_p(n),$$
$$C_2\bigl(G^*(n,(d_i)_1^n)^{\mathsf E}_{\alpha;p}\bigr) = o_p(n),$$

where $\tau^{\mathsf E}_{\alpha;p} = 1 - \pi^{\mathsf E}_{\alpha;p}$ with $\pi^{\mathsf E}_{\alpha;p}$ given by (3.18). If also Condition 2.3 holds,
then the same results hold for $G(n,(d_i)_1^n)^{\mathsf E}_{\alpha;p}$ too.